ABSTRACT: This paper develops robust, regression-based forms of
Newey's conditional moment tests for models estimated by
quasi-maximum likelihood using a density in the linear exponential
family. A novel feature of these tests is that, in addition to the
original estimation, they require only two linear least squares
regressions for computation, while remaining robust to
distributional misspecifications other than those being explicitly
tested. Several examples are presented to illustrate the simplicity
and scope of the procedure: a Lagrange multiplier test for
nonlinear regression, the score form of the Hausman test for the
parameters of a conditional mean, and a regression form of the
Davidson-MacKinnon nonnested hypotheses test. All of the tests
assume only that the conditional mean is correctly specified under
the null hypothesis.

Tests for second moment misspecification, developed using
White's information matrix testing principle, assume only that the
first two moments are correctly specified under the null hypothesis.
A special case is a regression-based test for heteroskedasticity in
nonlinear models which relaxes the assumption that the conditional
fourth moment of the errors is constant. Also, a simple
distributional test for the Poisson regression model is presented.

KEYWORDS: Conditional moment tests, robustness, quasi-maximum
likelihood, linear exponential family.


1.  INTRODUCTION 

Many economic hypotheses can be formulated in terms of the
conditional expectation $E(Y_t|X_t)$ of one set of variables $Y_t$ given a
set of predetermined variables $X_t$. If the conditional expectation
is known up to a finite number of parameters then hypotheses of
interest can be formulated as restrictions on parameters; classical
inference procedures are then available for formally carrying out
the appropriate tests.

Sometimes economists are interested in comparing two models for
$Y_t$, neither of which contains the other as a special case. In this
case the competing economic hypotheses are nonnested in a
statistical sense, and classical testing procedures (e.g. Wald,
Likelihood Ratio and Lagrange Multiplier tests) are no longer
applicable. There are, however, several tests available in the
presence of nonnested alternatives. The Cox (1961, 1962) approach is
useful in the general maximum likelihood setting. Davidson and
MacKinnon (1981) derive tests for nonnested regression models.

If one is not interested in specific alternatives to the
postulated regression function, but is concerned about
misspecification which leads to inconsistent estimates of economic
parameters, then the Hausman (1978) methodology is available.

This paper develops a class of Newey's (1985) conditional
moment (CM) tests that are explicitly designed to detect
misspecification of a conditional expectation. The class of tests
considered is broad enough to contain the three types of tests
mentioned above, and allows for time series as well as cross section
observations.

Broadly speaking, the setup here is encompassed by White
(1985b), who extends Newey's work and develops a framework which
includes conditional moment tests for time series observations as a
special case. However, specializing White's results to a class of
moment restrictions intended to detect (dynamic) misspecification of
the regression function, and restricting attention to the class of
quasi-maximum likelihood estimators (QMLE's) derived from a density
in the linear exponential family (LEF), allows derivation of simple
forms of these tests without additionally assuming that the
conditional density is correct under the null hypothesis. If the
conditional mean is the object of interest, then a test which
further assumes that the distribution is correctly specified will
generally have the wrong asymptotic size for testing the relevant
null hypothesis. Moreover, standard regression forms of conditional
mean tests are inconsistent for testing distributional assumptions
beyond the first moment.

A useful feature of the robust tests derived here is that, in
addition to the QMLE estimation, only two linear least squares
regressions are required for computation. In many cases the
statistics needed for the auxiliary regressions are computable from
the final iteration of the Berndt, Hall, Hall and Hausman (BHHH)
algorithm. One simple consequence of the results here is that
heteroskedasticity-robust Lagrange Multiplier tests for exclusion
restrictions in a dynamic linear model are computable by running a
total of three linear regressions.


Although correctly specifying the regression function is
usually the primary concern of the applied econometrician, it is
also useful to know whether the distribution is correctly specified.
Section 5 of this paper derives a modification of White's (1982)
information matrix test which is designed to detect misspecification
of the conditional second moment. An interesting feature of the
test derived here is that it is regression-based while remaining
robust to misspecification of other aspects of the distribution:
the null hypothesis states only that the first two conditional
moments are correctly specified. As a special case, it yields a
regression form of the White (1980a) test for heteroskedasticity for
nonlinear regression which does not require that the errors have a
constant conditional fourth moment. In addition, it gives simple
tests for distributional specification in such interesting cases as
the Poisson regression model.

The  remainder  of  the  paper  is  organized  as  follows.   Section  2 
introduces  the  general  setup  and  briefly  discusses  some  useful 
properties  of  LEF  distributions.   Section  3  derives  the 
computationally  simple  conditional  mean  test  statistics.   Several 
examples  of  conditional  mean  tests  are  presented  in  Section  4. 
Section  5  considers  a  modified  information  matrix  test  for 
conditional  second  moments,  and  Section  6  contains  some  concluding 
remarks. 


2.  NOTATION  AND  SETUP 


Let $\{(Y_t, Z_t): t=1,2,\ldots\}$ be a sequence of observable random
vectors with $Y_t$ $1 \times K$ and $Z_t$ $1 \times L$. The variables $Y_t$ are the
dependent or endogenous variables. Interest lies in explaining $Y_t$ in
terms of the explanatory variables $Z_t$ and (in a time series context)
past values of $Y_t$ and $Z_t$. Let $X_t \equiv (Z_t, Y_{t-1}, Z_{t-1}, \ldots, Y_1, Z_1)$
denote the predetermined variables ($Z_t$ may be excluded from $X_t$
without altering the following results). The conditional distribution
of $Y_t$ given $X_t = x_t$ always exists and is denoted $D_t(\cdot|x_t)$. Under
weak conditions on $D_t(\cdot|x_t)$ there exists a conditional density
$p_t(y_t|x_t)$ with respect to a $\sigma$-finite measure $\nu_t(dy_t)$ (see
Wooldridge (1987)). Because (by definition) we are not interested in
the stochastic behavior of $\{Z_t\}$, the conditional densities
$p_t(y_t|x_t)$ describe the relevant dynamic behavior of $\{Y_t\}$.

We choose for $p_t(y_t|x_t)$ a class of conditional densities
$\{f_t(y_t|x_t; m_t, \eta_t)\}$ (which may or may not contain $p_t(y_t|x_t)$)
which consists of members of the linear exponential family. In
particular,

(2.1)  $\log f_t(y_t|x_t, m_t, \eta_t) = a(m_t, \eta_t) + b(y_t, \eta_t) + y_t c(m_t, \eta_t)$

where $m_t$ is $1 \times K$, $\eta_t$ is $1 \times J$, $c(\cdot)$ is $K \times 1$, and $a(\cdot)$ and $b(\cdot)$ are
scalars. The function $m_t(x_t)$ is the expectation associated with the
density $f_t(y_t|x_t, m_t, \eta_t)$. This entails the restriction

(2.2)  $m_t(x_t) \nabla_m c(m_t(x_t), \eta_t(x_t)) = -\nabla_m a(m_t(x_t), \eta_t(x_t))$

for all $x_t$.
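
As a check on the notation, the following worked equations (added for illustration; they are standard facts about these two densities, not taken from the original text) write the scalar normal and Poisson densities in the form $a + b + yc$ and confirm the restriction (2.2). For the normal case the nuisance parameter $\eta$ is taken to be the variance.

```latex
% Normal with mean m and variance \eta (K = 1):
\log f = \underbrace{-\frac{m^2}{2\eta} - \tfrac12\log(2\pi\eta)}_{a(m,\eta)}
       + \underbrace{-\frac{y^2}{2\eta}}_{b(y,\eta)}
       + y\,\underbrace{\frac{m}{\eta}}_{c(m,\eta)},
\qquad
m\,\nabla_m c = \frac{m}{\eta} = -\nabla_m a .

% Poisson with mean m (no nuisance parameter):
\log f = \underbrace{-m}_{a(m)}
       + \underbrace{-\log(y!)}_{b(y)}
       + y\,\underbrace{\log m}_{c(m)},
\qquad
m\,\nabla_m c = 1 = -\nabla_m a .
```

In both cases $[\nabla_m c]^{-1}$ equals the variance implied by the assumed density ($\eta$ for the normal, $m$ for the Poisson).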


Following Gourieroux, Monfort and Trognon (1984a) (hereafter
GMT (1984a)), the LEF family is parameterized through the
conditional mean function

(2.3)  $\{m_t(x_t, \theta): \theta \in \Theta \subset \mathbb{R}^P\}$

and the nuisance parameters

(2.4)  $\{\eta_t(x_t, \pi): \pi \in \Pi \subset \mathbb{R}^N\}$.

In the context of Section 3, correct specification means
correct (dynamic) specification of the conditional mean, i.e.

(2.5)  $E(Y_t|X_t = x_t) = m_t(x_t, \theta_o)$  for some $\theta_o \in \Theta$, $t=1,2,\ldots$.

As  GMT  (1984a)  have  shown  in  the  case  of  independent 
observations  and  as  White  (1985a)  has  shown  in  a  more  general 
dynamic  setting,  the  LEF  class  of  densities  has  the  useful  property 
of  consistently  estimating  the  parameters  of  a  correctly  specified 
conditional expectation despite misspecification of other aspects of
the  conditional  distribution. 

The nuisance parameter $\pi$ may be assigned any value in $\Pi$; more
generally, it may be replaced by an estimator $\hat\pi_T$ such that
$\hat\pi_T - \pi_T^o \to_p 0$, where $\{\pi_T^o\} \subset \Pi$. The "T" subscript on the plim
of $\hat\pi_T$ is generally required because $\hat\pi_T$ need not be estimating
any "true" parameters (parameters of interest) even when (2.5) holds.
For example, $\hat\pi_T$ could be an estimator from a misspecified
parameterized conditional variance equation in the context of
weighted nonlinear least squares, or it could be an estimator from an
alternative model in a nonnested hypotheses framework. When the data
are strictly stationary, $\pi_T^o$ does not depend on $T$. In Section 5,
when second moment specification is considered, the plim of $\hat\pi_T$ is
$\pi_o$, where $(\theta_o, \pi_o)$ indexes the conditional variance of $Y_t$ given $X_t$
under the null hypothesis.

The QMLE $\hat\theta_T$ maximizes the quasi log-likelihood function

$L_T(\theta; Y, Z) = \sum_{t=1}^{T} \ell_t(Y_t, X_t; \hat\pi_T, \theta)$

where $\ell_t(y_t, x_t; \pi, \theta)$ is the conditional log likelihood

$a(m_t(x_t,\theta), \eta_t(x_t,\pi)) + y_t c(m_t(x_t,\theta), \eta_t(x_t,\pi))$

(note that $b(y_t, \eta_t(x_t,\pi))$ does not appear since it does not affect
the optimization problem). For simplicity, the argument $x_t$ is
omitted wherever convenient. Letting $s_{Tt}(\theta) \equiv \nabla_\theta \ell_t(\theta, \pi_T^o)$ and
suppressing the dependence of $s_{Tt}$ on $\pi_T^o$ (and hence on $T$) we have

(2.6)  $s_t(\theta) = \nabla_m a(m_t(\theta)) \nabla_\theta m_t(\theta) + Y_t \nabla_m c(m_t(\theta)) \nabla_\theta m_t(\theta)$
             $= (Y_t - m_t(\theta)) \nabla_m c(m_t(\theta)) \nabla_\theta m_t(\theta)$
             $= U_t(\theta) \nabla_m c(m_t(\theta)) \nabla_\theta m_t(\theta)$

where $U_t(\theta) \equiv Y_t - m_t(X_t, \theta)$ is the $1 \times K$ residual function. The
second equality follows from (2.2). If the mean is correctly
specified then $E(Y_t|X_t) = m_t(X_t, \theta_o)$ and the true residuals
$U_t^o \equiv U_t(\theta_o)$ are defined. In this case, because $\nabla_\theta m_t(\theta_o)$ and
$\nabla_m c_t(m_t(\theta_o))$ depend only on $X_t$,

(2.7)  $E(s_t(\theta_o)|X_t) = 0$.

This shows that $\{s_t^o \equiv s_t(\theta_o): t=1,2,\ldots\}$ is a martingale difference
sequence with respect to the $\sigma$-fields $\{\sigma(Y_t, X_t): t=1,2,\ldots\}$.
Equation (2.7) also establishes Fisher-consistency of the QMLE (when
$\hat\pi_T$ is replaced by its plim, or any fixed value) and is the basis for
the GMT (1984a) results for dynamic models. It can also be shown
(see GMT (1984a)) that if the conditional mean is correctly
specified then

(2.8)  $E(h_t(\theta_o)|X_t) = -\nabla m_t^{o\prime} \nabla c_t^o \nabla m_t^o$

where $h_t(\theta) \equiv \nabla_\theta s_t(\theta)$ and values with a "o" superscript are
evaluated at $(\theta_o, \pi_T^o)$. The conditional variance of the score is

(2.9)  $V(s_t^o|X_t) = \nabla m_t^{o\prime} \nabla c_t^o \Omega_t^o(X_t) \nabla c_t^o \nabla m_t^o$

where $\Omega_t^o(x_t) \equiv V(Y_t|X_t = x_t)$ is the true conditional covariance matrix
of $Y_t$ given $X_t$. It can be shown that $[\nabla_m c(m_t(\theta), \eta_t(\pi))]^{-1}$ is the
covariance matrix associated with the density $f_t(y_t|x_t, m_t(\theta), \eta_t(\pi))$.
The conditional information equality holds provided

(2.10)  $\Omega_t^o(X_t) = [\nabla_m c(m_t(X_t,\theta_o), \eta_t(X_t,\pi_T^o))]^{-1}$,  $t=1,2,\ldots$

and this is the case if the assumed density (evaluated at $(\theta_o, \pi_T^o)$)
has second moment corresponding to the actual conditional covariance
of $Y_t$. In general,

$A_T^o \equiv -T^{-1} \sum_{t=1}^{T} E[h_t(\theta_o, \pi_T^o)]$  and

$B_T^o \equiv T^{-1} \sum_{t=1}^{T} E[s_t(\theta_o, \pi_T^o)' s_t(\theta_o, \pi_T^o)]$

differ even when the conditional mean is correctly specified, so
that the information equality fails. Under standard conditions,
$T^{1/2}(\hat\theta_T - \theta_o)$ converges in law to $N(0, A_T^{o-1} B_T^o A_T^{o-1})$. Because $A_T^o$ and
$B_T^o$ can be consistently estimated by positive semi-definite matrices
which require only first derivatives of $m_t$ (with respect to $\theta$) and $c$
(with respect to $m$), robust classical inference is fairly
straightforward for this class of QMLE's. The next section derives
regression-based specification tests which allow robust inference
for a wide variety of testing procedures.

3.  CONDITIONAL  MEAN  TESTS 

This  section  focuses  on  a  class  of  specification  tests  designed 
to  detect  departures  from  the  hypothesis 

(3.1)  $H_0$: $E(Y_t|X_t) = m_t(X_t, \theta_o)$, for some $\theta_o \in \Theta$, $t=1,2,\ldots$.

Let $U_t(\theta) \equiv Y_t - m_t(\theta)$ be the $1 \times K$ residual function, and let
$U_t^o \equiv U_t(\theta_o)$ be the "true" residuals under $H_0$. Suppose that $\Lambda_t(X_t)$ is a
$K \times Q$ matrix function of the predetermined variables $X_t$. If (3.1)
holds then by the law of iterated expectations (assuming existence
of the appropriate moments),

$E[\Lambda_t(X_t, \theta_o, \pi_T^o)' \nabla c_t(\theta_o, \pi_T^o) U_t^{o\prime}] = 0$,  $t=1,2,\ldots$.

Note that $\Lambda_t$ is allowed to depend on $\theta_o$ and the limiting value of
the nuisance parameter estimator.

As pointed out by Newey (1985) and Tauchen (1985), this moment
condition suggests basing a test of (3.1) on a quadratic form in the
$Q \times 1$ vector

$\xi_T \equiv T^{-1} \sum_{t=1}^{T} \psi_t(\hat\theta_T, \hat\pi_T)$

where $\psi_t(\theta, \pi) \equiv \Lambda_t(\theta,\pi)' \nabla c_t(\theta,\pi) U_t(\theta)'$. It is readily seen that the
asymptotic distribution of $T^{1/2}\xi_T$ does not depend on that of $\hat\pi_T$
provided $T^{1/2}(\hat\pi_T - \pi_T^o)$ is $O_p(1)$, which is typically the case. Under
suitable regularity conditions, it is straightforward to establish
that under $H_0$,

(3.2)  $T^{1/2}\xi_T \to_d N(0, \Gamma_T^o V_T^o \Gamma_T^{o\prime})$

where

(3.3)  $\Gamma_T^o \equiv [\, I_Q \;\vdots\; -\{T^{-1}\sum_{t=1}^{T} E(\Lambda_t^{o\prime} \nabla c_t^o \nabla m_t^o)\}\{T^{-1}\sum_{t=1}^{T} E(\nabla m_t^{o\prime} \nabla c_t^o \nabla m_t^o)\}^{-1} \,]$

and

(3.4)  $V_T^o \equiv T^{-1} \sum_{t=1}^{T} V([\psi_t^{o\prime}, s_t^o]') = T^{-1} \sum_{t=1}^{T} E([\psi_t^{o\prime}, s_t^o]'[\psi_t^{o\prime}, s_t^o])$.

Equation (3.2) can be used as the basis for testing the correct
specification of the conditional mean, with the resulting statistics
being robust against misspecification of other aspects of the
conditional distribution of $Y_t$ given $X_t$. All that one needs are
consistent estimators of $\Gamma_T^o$ and $V_T^o$, and these are available from
(3.3), (3.4), $\hat U_t$, $\hat\theta_T$ and $\hat\pi_T$. However, computation of the resulting
test statistic requires special programming. A method for computing
the test statistics that requires only auxiliary least squares
regressions can substantially reduce the computational burden and
give insight about the directions of misspecification for which the
test is inconsistent.

To motivate the general approach, consider a familiar example.
For simplicity, assume that the observations are independent so that
$X_t = Z_t$, and consider the linear model

(3.5)  $E(Y_t|X_t) = X_{t1}\alpha_o + X_{t2}\beta_o$

where $Y_t$ is a scalar, $X_{t1}$ is $1 \times P_1$, and $X_{t2}$ is $1 \times P_2$. Suppose that the
hypothesis of interest is

(3.6)  $H_0$: $\beta_o = 0$.

The LM approach leads to a test based on the sample covariance of
the residuals estimated under the null and the excluded variables

(3.7)  $T^{-1} \sum_{t=1}^{T} X_{t2}' \hat U_t$.

Suppose that, instead of directly using (3.7), the part of $X_{t2}$ that
is correlated with $X_{t1}$ is first removed from $X_{t2}$. That is, perform
a multivariate regression of $X_{t2}$ on $X_{t1}$ and form the residuals
$\ddot X_{t2} \equiv X_{t2} - X_{t1}\hat B_T$, $t=1,\ldots,T$, where $\hat B_T$ is the $P_1 \times P_2$ matrix of least
squares coefficients:

$\hat B_T \equiv \left[\sum_{t=1}^{T} X_{t1}'X_{t1}\right]^{-1} \sum_{t=1}^{T} X_{t1}'X_{t2}$.

Then, because $\sum_{t=1}^{T} X_{t1}'\hat U_t = 0$, the statistic in (3.7) is identical to
that obtained by replacing $X_{t2}$ with $\ddot X_{t2}$:

(3.8)  $\xi_T \equiv T^{-1} \sum_{t=1}^{T} X_{t2}'\hat U_t = T^{-1} \sum_{t=1}^{T} \ddot X_{t2}'\hat U_t$.

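The advantage of working with $\xi_T$ expressed in terms of $\ddot X_{t2}'\hat U_t$ is that
it can be expanded as

(3.9)  $T^{1/2}\xi_T = T^{-1/2} \sum_{t=1}^{T} (X_{t2} - X_{t1}\hat B_T)'\hat U_t$
            $= T^{-1/2} \sum_{t=1}^{T} (X_{t2} - X_{t1}B_T^o)'U_t^o$
            $\;\; - \left[T^{-1} \sum_{t=1}^{T} (X_{t2} - X_{t1}B_T^o)'X_{t1}\right] T^{1/2}(\hat\alpha_T - \alpha_o)$
            $\;\; - (\hat B_T - B_T^o)'\, T^{-1/2} \sum_{t=1}^{T} X_{t1}'\hat U_t$

where

$B_T^o \equiv \left[T^{-1}\sum_{t=1}^{T} E(X_{t1}'X_{t1})\right]^{-1} T^{-1}\sum_{t=1}^{T} E(X_{t1}'X_{t2})$.

By the first order condition for the OLS estimator $\hat\alpha_T$, the third
term on the right hand side of (3.9) is identically zero. Also note
that $B_T^o$ is defined so that

$T^{-1} \sum_{t=1}^{T} E[(X_{t2} - X_{t1}B_T^o)'X_{t1}] = 0$.

By the weak law of large numbers (WLLN),

$T^{-1} \sum_{t=1}^{T} (X_{t2} - X_{t1}B_T^o)'X_{t1} \to_p 0$.

Combined with $T^{1/2}(\hat\alpha_T - \alpha_o) = O_p(1)$, this shows that the second term
in (3.9) is $o_p(1)$ under general conditions if $H_0$ is true. Deriving
the limiting distribution of the LM statistic therefore reduces to
deriving the limiting distribution of

(3.10)  $T^{-1/2} \sum_{t=1}^{T} (X_{t2} - X_{t1}B_T^o)'U_t^o$.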

Under standard regularity conditions, if $H_0$ is true, (3.10) is
asymptotically $N(0, \Xi_T^o)$ where

(3.11)  $\Xi_T^o \equiv T^{-1} \sum_{t=1}^{T} E[(U_t^o)^2 (X_{t2} - X_{t1}B_T^o)'(X_{t2} - X_{t1}B_T^o)]$.

Note that $\Xi_T^o$ is the correct expression whether or not $E[(U_t^o)^2|X_t]$ is
constant. Following White (1980a), a consistent estimator of $\Xi_T^o$ is

(3.13)  $\hat\Xi_T \equiv T^{-1} \sum_{t=1}^{T} \hat U_t^2 (X_{t2} - X_{t1}\hat B_T)'(X_{t2} - X_{t1}\hat B_T)$.

For testing $H_0$, this modified LM approach leads to the statistic

(3.14)  $T \xi_T' \hat\Xi_T^{-1} \xi_T$

which is distributed asymptotically as $\chi^2_{P_2}$ under $H_0$ in the presence
of heteroskedasticity of unknown form. From a computational
viewpoint, it is useful to note that (3.14) is computable as $TR^2$,
where $R^2$ is the uncentered r-squared from the auxiliary
regression

(3.15)  $1$  on  $\hat U_t \ddot X_{t2}$,  $t=1,\ldots,T$.

Interestingly, the regression in (3.15) yields a test statistic
that is numerically equivalent to what would be obtained by applying
White's (1984, Theorem 4.32) robust form of the LM statistic to the
case of heteroskedasticity. The preceding analysis shows that the
White statistic can be computed entirely from least squares
regressions.

The procedure outlined above can be compared to the standard LM
approach. If the assumption of conditional homoskedasticity is
maintained and $\Xi_T^o$ is estimated by

(3.16)  $\hat\Xi_T \equiv \hat\sigma_T^2\, T^{-1} \sum_{t=1}^{T} (X_{t2} - X_{t1}\hat B_T)'(X_{t2} - X_{t1}\hat B_T)$,

where $\hat\sigma_T^2 \equiv T^{-1} \sum_{t=1}^{T} \hat U_t^2$, then the resulting test statistic $T\xi_T'\hat\Xi_T^{-1}\xi_T$ is
exactly the r-squared form of the LM statistic, which is obtained
from the regression

(3.17)  $\hat U_t$  on  $X_{t1}, X_{t2}$,  $t=1,\ldots,T$.

This regression has an uncentered r-squared which is identical to
the uncentered r-squared from the regression

(3.18)  $\hat U_t$  on  $X_{t2} - X_{t1}\hat B_T$,  $t=1,\ldots,T$

verifying that the robust statistic obtained from the regression in
(3.15) is asymptotically equivalent to the statistic obtained from
(3.17) (or (3.18)) if $H_0$ is true and $Y_t$ is conditionally
homoskedastic. In general, as emphasized by White (1980a,b, 1984) in
several contexts, the regression statistic based on (3.17) is not
asymptotically $\chi^2_{P_2}$ under $H_0$ if heteroskedasticity is present. The
statistic based on the regression in (3.15) does have a limiting $\chi^2_{P_2}$
distribution under $H_0$ in the presence of heteroskedasticity of unknown form.
Thus the statistic based on (3.15) is preferred.
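
The three regressions behind (3.15) can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the paper: the function names `uncentered_tr2` and `robust_lm_exclusion` and the simulated data are assumptions made for the example.

```python
import numpy as np

def uncentered_tr2(Z):
    """T times the uncentered R-squared from regressing a column of ones on Z."""
    ones = np.ones(Z.shape[0])
    coef, *_ = np.linalg.lstsq(Z, ones, rcond=None)
    fitted = Z @ coef
    # Uncentered R^2 = fitted'fitted / ones'ones, so T * R^2 = fitted'fitted.
    return float(fitted @ fitted)

def robust_lm_exclusion(y, X1, X2):
    """Heteroskedasticity-robust LM statistic for H0: beta = 0 in
    E(y|X) = X1 @ alpha + X2 @ beta, computed as in regression (3.15)."""
    # Regression 1: restricted OLS of y on X1, keep the residuals.
    alpha_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)
    u_hat = y - X1 @ alpha_hat
    # Regression 2: purge X1 from X2, keep the residuals.
    B_hat, *_ = np.linalg.lstsq(X1, X2, rcond=None)
    X2_res = X2 - X1 @ B_hat
    # Regression 3: regress 1 on u_hat * X2_res and use T * R^2.
    return uncentered_tr2(u_hat[:, None] * X2_res)

# Simulated data (hypothetical): H0 is true and the errors are heteroskedastic.
rng = np.random.default_rng(0)
T = 500
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])
X2 = rng.normal(size=(T, 2))
y = X1 @ np.array([1.0, 0.5]) + np.abs(X1[:, 1]) * rng.normal(size=T)

stat = robust_lm_exclusion(y, X1, X2)  # compare with chi-square, 2 d.o.f.
```

The statistic agrees numerically with the quadratic form in (3.14), because both reduce to $\iota'Z(Z'Z)^{-1}Z'\iota$, where $Z$ stacks the rows $\hat U_t \ddot X_{t2}$ and $\iota$ is a vector of ones.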

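To extend the above approach for general CM tests, recall the
first order condition for the QMLE $\hat\theta_T$:

(3.20)  $\sum_{t=1}^{T} \nabla_\theta m_t(\hat\theta_T)' \nabla_m c_t(m_t(\hat\theta_T), \eta_t(\hat\pi_T)) (Y_t - m_t(\hat\theta_T))' = 0$

or, in shorthand,

(3.21)  $\sum_{t=1}^{T} \nabla\hat m_t' \nabla\hat c_t \hat U_t' = 0$.

Therefore, the indicator $\xi_T = T^{-1} \sum_{t=1}^{T} \hat\Lambda_t' \nabla\hat c_t \hat U_t'$ is identical to

(3.22)  $T^{-1} \sum_{t=1}^{T} (\nabla\hat c_t^{1/2}\hat\Lambda_t - \nabla\hat c_t^{1/2}\nabla\hat m_t \hat B_T)' \nabla\hat c_t^{1/2} \hat U_t'$

where $\hat B_T$ is the $P \times Q$ matrix of regression coefficients from a
matrix regression of $\nabla\hat c_t^{1/2}\hat\Lambda_t$ on $\nabla\hat c_t^{1/2}\nabla\hat m_t$:

(3.23)  $\hat B_T \equiv \left[\sum_{t=1}^{T} \nabla\hat m_t' \nabla\hat c_t \nabla\hat m_t\right]^{-1} \sum_{t=1}^{T} \nabla\hat m_t' \nabla\hat c_t \hat\Lambda_t$.

The statistic in (3.22) first purges from $\nabla\hat c_t^{1/2}\hat\Lambda_t$ its least squares
projection based on $\nabla\hat c_t^{1/2}\nabla\hat m_t$ before constructing the indicator.
Note that $\nabla\hat c_t^{1/2}$ is an estimator of $[V(Y_t|X_t)]^{-1/2}$ if the second
moment is correctly specified, but not in general. It can be shown
as in the linear least squares case that

(3.24)  $T^{-1/2} \sum_{t=1}^{T} (\nabla\hat c_t^{1/2}\hat\Lambda_t - \nabla\hat c_t^{1/2}\nabla\hat m_t \hat B_T)' \nabla\hat c_t^{1/2} \hat U_t'$
        $= T^{-1/2} \sum_{t=1}^{T} (\nabla c_t^{o1/2}\Lambda_t^o - \nabla c_t^{o1/2}\nabla m_t^o B_T^o)' \nabla c_t^{o1/2} U_t^{o\prime} + o_p(1)$,

where

$B_T^o \equiv \left[T^{-1} \sum_{t=1}^{T} E(\nabla m_t^{o\prime} \nabla c_t^o \nabla m_t^o)\right]^{-1} T^{-1} \sum_{t=1}^{T} E(\nabla m_t^{o\prime} \nabla c_t^o \Lambda_t^o)$

and values with "o" superscripts are evaluated at $(\theta_o, \pi_T^o)$. Under
$H_0$, a consistent estimator of the asymptotic covariance matrix of
the right hand side of (3.24) (and therefore of the LHS as well) is

(3.25)  $T^{-1} \sum_{t=1}^{T} (\nabla\hat c_t^{1/2}\hat\Lambda_t - \nabla\hat c_t^{1/2}\nabla\hat m_t \hat B_T)' \nabla\hat c_t^{1/2} \hat U_t' \hat U_t \nabla\hat c_t^{1/2} (\nabla\hat c_t^{1/2}\hat\Lambda_t - \nabla\hat c_t^{1/2}\nabla\hat m_t \hat B_T)$.

Equations (3.24) and (3.25) lead to the following theorem.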


Theorem 3.1: Suppose that the following assumptions hold:
(i) Regularity conditions A.1 in the appendix;
(ii) $H_0$: $E(Y_t|X_t) = m_t(X_t, \theta_o)$, for some $\theta_o \in \Theta$, $t=1,2,\ldots$.

Then

$\zeta_T \to_d \chi^2_Q$

where $\zeta_T = TR^2$ and $R^2$ is the uncentered r-squared from the auxiliary
regression

$1$  on  $\hat U_t \nabla\hat c_t (\hat\Lambda_t - \nabla\hat m_t \hat B_T)$,  $t=1,\ldots,T$,

and $\hat B_T$ is given by (3.23).


In practice, Theorem 3.1 can be applied as follows:

(i) Given the nuisance parameter estimate $\hat\pi_T$, compute the
QMLE $\hat\theta_T$, the weighted residuals $\tilde U_t \equiv \hat U_t \nabla\hat c_t^{1/2}$, the weighted
regression function $\nabla\tilde m_t \equiv \nabla\hat c_t^{1/2}\nabla\hat m_t$, and the weighted indicator
function $\tilde\Lambda_t \equiv \nabla\hat c_t^{1/2}\hat\Lambda_t$;

(ii) Perform a multivariate regression of $\tilde\Lambda_t$ on $\nabla\tilde m_t$ and
keep the residuals, say $\ddot\Lambda_t$;

(iii) Perform the OLS regression

(3.26)  $1$  on  $\tilde U_t \ddot\Lambda_t$,  $t=1,2,\ldots,T$

and use $TR^2$ as asymptotically $\chi^2_Q$ under $H_0$, where $R^2$ is of course the
uncentered r-squared. ■
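
Steps (i)-(iii) can be sketched for the scalar case ($K = 1$) as follows. This is an illustrative sketch under stated assumptions rather than the paper's own code: the function name `cm_test_scalar`, the Poisson example with mean $\exp(x_t\theta)$, the Newton iterations used to obtain the QMLE, and the choice of indicator (the gradient with respect to an omitted $x_t^2$ term) are all hypothetical.

```python
import numpy as np

def cm_test_scalar(u, grad_c, grad_m, lam):
    """TR^2 form of the robust conditional mean test for scalar Y_t (K = 1).

    u      : (T,)   QMLE residuals Y_t - m_t(theta_hat)
    grad_c : (T,)   derivative of c with respect to m (positive weights)
    grad_m : (T, P) gradient of m_t with respect to theta
    lam    : (T, Q) misspecification indicators
    """
    w = np.sqrt(grad_c)
    u_w = w * u                      # step (i): weighted residuals
    m_w = w[:, None] * grad_m        # step (i): weighted regression function
    lam_w = w[:, None] * lam         # step (i): weighted indicator
    # Step (ii): residuals from regressing the weighted indicator on m_w.
    B_hat, *_ = np.linalg.lstsq(m_w, lam_w, rcond=None)
    lam_res = lam_w - m_w @ B_hat
    # Step (iii): regress 1 on u_w * lam_res; T * uncentered R^2 = fitted'fitted.
    Z = u_w[:, None] * lam_res
    coef, *_ = np.linalg.lstsq(Z, np.ones(len(u)), rcond=None)
    fitted = Z @ coef
    return float(fitted @ fitted)

# Hypothetical illustration: Poisson QMLE with mean exp(x'theta), so c(m) = log m
# and grad_c = 1/m.
rng = np.random.default_rng(1)
T = 400
x = rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
y = rng.poisson(np.exp(0.5 + 0.3 * x)).astype(float)

theta = np.array([np.log(y.mean()), 0.0])
for _ in range(50):                  # Newton steps on the Poisson quasi-score
    m = np.exp(X @ theta)
    theta = theta + np.linalg.solve((X * m[:, None]).T @ X, X.T @ (y - m))

m = np.exp(X @ theta)
stat = cm_test_scalar(u=y - m, grad_c=1.0 / m, grad_m=m[:, None] * X,
                      lam=(m * x**2)[:, None])   # compare with chi-square, 1 d.o.f.
```

Because $TR^2$ is unchanged when every regressor is multiplied by the same constant, rescaling `grad_c` by a positive constant leaves the statistic unchanged; only the way the weights vary across observations matters.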

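This procedure assumes that the matrix

$T^{-1} \sum_{t=1}^{T} E[(\nabla c_t^{o1/2}\Lambda_t^o - \nabla c_t^{o1/2}\nabla m_t^o B_T^o)' \nabla c_t^{o1/2} U_t^{o\prime} U_t^o \nabla c_t^{o1/2} (\nabla c_t^{o1/2}\Lambda_t^o - \nabla c_t^{o1/2}\nabla m_t^o B_T^o)]$

(the population counterpart of (3.25)) is positive definite uniformly
in $T$. This condition can fail if the weighted indicator $\nabla_m c_t^{1/2}\Lambda_t$
contains redundancies (more precisely, if
$\nabla_m c_t^{1/2}\Lambda_t - \nabla_m c_t^{1/2}\nabla_\theta m_t B_T$ contains redundancies). In this case
the regression form can still be used, but the degrees of freedom in
the chi-square distribution must be appropriately reduced.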

The above procedure gives a simple method for testing
specification of the conditional mean for a broad class of
multivariate models estimated by QMLE, without imposing the
additional assumption that the conditional variance of $Y_t$ is
correctly specified under the null hypothesis. Note that if the
second moment of $Y_t$ is correctly specified then the weighting matrix
appearing above is the inverse square root of the conditional
variance of $Y_t$, that is, $[V(Y_t|X_t)]^{-1/2}$. In general, the weight
corresponds to the variance of the assumed distribution, and need not
be related to the conditional covariance matrix of $Y_t$ given $X_t$.

The regression appearing in (3.26) is similar to auxiliary
regressions appearing in Newey (1985), White (1985b), and elsewhere,
but there is an important difference. Consider the regressions

(3.27)  $1$  on  $\hat U_t \nabla\hat c_t \nabla\hat m_t$, $\hat U_t \nabla\hat c_t \hat\Lambda_t$

(3.28)  $\nabla\hat c_t^{1/2}\hat U_t'$  on  $\nabla\hat c_t^{1/2}\nabla\hat m_t$, $\nabla\hat c_t^{1/2}\hat\Lambda_t$

where the multivariate regression in (3.28) (when $K > 1$) is carried
out by stacking the data and using OLS. If $[\nabla c_t(\theta_o, \pi_T^o)]^{-1} = V(Y_t|X_t)$
then the $TR^2$ from either of these regressions is
asymptotically distributed as $\chi^2_Q$ under $H_0$. If the conditional
second moment of $Y_t$ is misspecified, i.e. $[\nabla c_t(\theta_o, \pi_T^o)]^{-1} \neq V(Y_t|X_t)$,
then neither of these statistics generally has an asymptotic $\chi^2_Q$
distribution (although the statistic obtained from (3.27) has a
better chance of having a limiting chi-square distribution), and
they are no longer asymptotically equivalent. The statistic derived
from (3.26) has a limiting $\chi^2_Q$ distribution under $H_0$ whether or not
the second moment is correctly specified. If the conditional mean
is the object of interest, and the researcher is at all uncertain
about the distributional assumption, then the methodology of Theorem
3.1 is preferred.

The setup of Theorem 3.1 allows consideration of a wide class
of procedures used by applied econometricians. Some examples of how
to choose $\Lambda_t$ in some familiar cases are given in the following
section.


4.  EXAMPLES  OF  CONDITIONAL  MEAN  TESTS

The approach taken in Theorem 3.1 was motivated by considering
the LM test for exclusion restrictions in a linear model with
independent observations. This section presents some further
applications of Theorem 3.1. Included are robust LM tests, robust
regression forms of Hausman tests, and a modified Davidson-MacKinnon
test of nonnested hypotheses.

Example 4.1 (LM test in nonlinear regression): For simplicity,
assume that $K=1$, so that $Y_t$ is a scalar. Consider estimating
$E(Y_t|X_t)$ by nonlinear least squares. In this case, the nuisance
parameter $\pi = \sigma^2$ is the variance associated with the assumed LEF
distribution, $N(\mu_t(X_t, \alpha, \beta), \sigma^2)$. Here $\alpha$ is $P_1 \times 1$ and $\beta$ is $P_2 \times 1$.
Assume that $E(Y_t|X_t) = \mu_t(X_t, \alpha_o, \beta_o)$. The null hypothesis is

$H_0$: $\beta_o = 0$.

The LM approach leads to a statistic based upon the $P_2 \times 1$ vector

$\sum_{t=1}^{T} \nabla_\beta \mu_t(X_t, \hat\alpha_T, 0)' \hat U_t$

where $\hat\alpha_T$ is the NLLS estimator of $\alpha_o$ obtained under the assumption
that $\beta_o = 0$, and $\hat U_t \equiv Y_t - \mu_t(X_t, \hat\alpha_T, 0)$. Let $\nabla_\alpha\hat\mu_t$, $\nabla_\beta\hat\mu_t$ denote the
gradients of $\mu_t$ with respect to $\alpha$, $\beta$ evaluated at $(\hat\alpha_T, 0)$. In the
notation of Theorem 3.1, $\theta = \alpha$, $m_t(\theta) = \mu_t(\alpha, 0)$, $c(m, \eta) = m/\eta$ and
$\eta(\pi) = \pi = \sigma^2$. Also, the indicator $\Lambda_t(\alpha, \pi)$ is $\nabla_\beta\mu_t(\alpha, 0)$ (where $X_t$
has been suppressed). A test of $H_0$ which is robust in the presence
of heteroskedasticity can be easily computed using the following
procedure.

(i) Estimate $\alpha_o$ by NLLS assuming $\beta_o = 0$. Compute the residuals
$\hat U_t$, and the gradients $\nabla_\alpha\mu_t(\hat\alpha_T, 0)$ and $\nabla_\beta\mu_t(\hat\alpha_T, 0)$.

(ii) Regress $\nabla_\beta\hat\mu_t$ on $\nabla_\alpha\hat\mu_t$ and keep the residuals, say $\nabla_\beta\ddot\mu_t$.

(iii) Regress $1$ on $\hat U_t \nabla_\beta\ddot\mu_t$ and use $TR^2$ from this regression as
asymptotically $\chi^2_{P_2}$ under $H_0$.
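
The following short derivation (added for illustration, using the Theorem 3.1 notation, with $\hat\sigma_T^2$ denoting whatever variance estimate is plugged in for $\pi$) shows why no weights appear in steps (i)-(iii): under the normal LEF the weight is the same constant for every observation, and an uncentered r-squared is unaffected by rescaling all regressors by a common constant.

```latex
c(m,\eta) = \frac{m}{\eta}
\;\Longrightarrow\;
\nabla_m \hat c_t = \hat\sigma_T^{-2}
\quad\text{for all } t,
\qquad
\hat U_t\,\nabla\hat c_t\,(\hat\Lambda_t - \nabla\hat m_t \hat B_T)
 = \hat\sigma_T^{-2}\,\hat U_t\,
   (\nabla_\beta\hat\mu_t - \nabla_\alpha\hat\mu_t \hat B_T),
\qquad
\hat B_T = \Big[\sum_{t=1}^{T}\nabla_\alpha\hat\mu_t'\nabla_\alpha\hat\mu_t\Big]^{-1}
           \sum_{t=1}^{T}\nabla_\alpha\hat\mu_t'\nabla_\beta\hat\mu_t .
```

The factor $\hat\sigma_T^{-2}$ cancels in $\hat B_T$ and rescales every regressor in the final regression by the same amount, so the value of $\hat\sigma_T^2$ never needs to be computed.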

As a special case of this procedure, consider the LM test for
AR(1) serial correlation in a dynamic model. The null and
alternative hypotheses are

$H_0$: $E(Y_t|X_t) = m_t(X_t, \alpha_o)$,  $t=1,2,\ldots$

$H_1$: $E(Y_t|X_t) = m_t(X_t, \alpha_o) + \rho_o (Y_{t-1} - m_{t-1}(X_{t-1}, \alpha_o))$,  $t=2,3,\ldots$

The LM procedure leads to a test based upon

$\sum_{t=2}^{T} \hat U_t \hat U_{t-1}$

where $\hat U_t = Y_t - m_t(X_t, \hat\alpha_T)$ and $\hat\alpha_T$ is the NLLS estimator obtained
under the assumption of no serial correlation. Theorem 3.1 leads
to the following procedure:

(i) Estimate $\alpha_o$ by NLLS. Keep $\hat U_t$ and $\nabla\hat m_t \equiv \nabla_\alpha m_t(X_t, \hat\alpha_T)$.

(ii) Regress $\hat U_{t-1}$ on $\nabla\hat m_t$ and keep the residuals from this
regression, say $\ddot U_{t-1}$.

(iii) Regress $1$ on $\hat U_t \ddot U_{t-1}$ and use $(T-1)R^2$ from this
regression as asymptotically $\chi^2_1$ under $H_0$.

Note that this procedure assumes nothing about the conditional
variance of $Y_t$ given $X_t$. Also, $X_t$ may contain lagged values of $Y_t$,
as well as $Z_t$ that are not strictly exogenous. This procedure
maintains the spirit of the usual LM procedure, but is robust to
heteroskedasticity. The OLS regression in (ii) is the cost of being
robust to heteroskedasticity.
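
For a mean that is linear in its parameters, so that $\nabla\hat m_t$ is simply the regressor vector, the AR(1) procedure reduces to the sketch below. The function name `robust_ar1_lm` and the simulated dynamic model are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def robust_ar1_lm(y, X):
    """Heteroskedasticity-robust LM test for AR(1) serial correlation when
    E(y_t | X_t) = X_t @ alpha, so the gradient of the mean is X_t itself.
    Returns (T-1) * uncentered R^2 from regressing 1 on u_t * u_{t-1}-residual."""
    # Step (i): estimate alpha by least squares and keep the residuals.
    alpha_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ alpha_hat
    u_t, u_lag, X_t = u[1:], u[:-1], X[1:]
    # Step (ii): regress the lagged residual on the gradient, keep residuals.
    g, *_ = np.linalg.lstsq(X_t, u_lag, rcond=None)
    u_lag_res = u_lag - X_t @ g
    # Step (iii): regress 1 on the single regressor z_t = u_t * u_lag_res_t.
    # With one regressor, (T-1) * R^2 = (sum z)^2 / (sum z^2).
    z = u_t * u_lag_res
    return float(z.sum() ** 2 / (z @ z))

# Simulated dynamic model (hypothetical): lagged dependent variable as a
# regressor, conditionally heteroskedastic errors, no serial correlation.
rng = np.random.default_rng(2)
T = 300
y = np.zeros(T + 1)
for t in range(1, T + 1):
    y[t] = 0.2 + 0.5 * y[t - 1] + (1 + 0.5 * abs(y[t - 1])) * rng.normal()
X = np.column_stack([np.ones(T), y[:-1]])

stat = robust_ar1_lm(y[1:], X)  # compare with chi-square, 1 d.o.f.
```

The regression in step (ii) matters here because the lagged dependent variable makes $\hat U_{t-1}$ correlated with the regressors; skipping it would give the usual non-robust indicator.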

Extending the above procedure to test for AR(Q) serial
correlation is straightforward. In (ii), regress $\hat U_{t-1}, \ldots, \hat U_{t-Q}$ on
$\nabla\hat m_t$ and save the residuals $\ddot U_{t-j}$, and in (iii), regress $1$ on $\hat U_t \ddot U_{t-j}$,
$j=1,\ldots,Q$. $(T-Q)R^2$ from this regression is asymptotically $\chi^2_Q$ under
$H_0$.

The above analysis extends to the case that the restrictions cannot be written as exclusion restrictions. Let $\delta_o$ be the $(P+D)\times 1$ vector of parameters in the unconstrained model and suppose that the restrictions under $H_0$ can be expressed as

$\delta_o = r(\alpha_o)$  for some $\alpha_o \in A$

where $A \subset \mathbb{R}^P$ and $r: A \to \Delta$. Let $\mu_t(x_t,\alpha) = m_t(x_t, r(\alpha))$. If $\alpha_o$ is in the interior of $A$ and $r$ is differentiable on int $A$ then a heteroskedasticity-robust test of $H_0$ is obtained as follows:

(i)  Estimate $\alpha_o$ by NLLS and save $\nabla_\alpha \mu_t(\hat\alpha_T)$ and the residuals $\hat U_t$. Let $\hat\delta_T = r(\hat\alpha_T)$ be the constrained estimator of $\delta_o$;

(ii)  Let $\nabla_\delta \hat m_t = \nabla_\delta m_t(\hat\delta_T)$ and run the multivariate regression

$\nabla_\delta \hat m_t$  on  $\nabla_\alpha \hat\mu_t$,  $t=1,\ldots,T$

and save the residuals, $\nabla_\delta \ddot m_t$;

(iii)  Run the regression

1  on  $\hat U_t \nabla_\delta \ddot m_t$,  $t=1,\ldots,T$

and use $TR^2$ as asymptotically $\chi^2_D$ under $H_0$.

Note that there is perfect multicollinearity in the regression in (iii), so that $P$ of the indicators can be excluded if the regression package used does not compute $R^2$'s for regressions containing perfect multicollinearity. Also, note that there is no need to explicitly compute the gradient of $r$ with respect to $\alpha$. This is to be contrasted with other methods to compute heteroskedasticity-robust test statistics (e.g. White (1984, Chapter 4)).

Example 4.2 (Hausman test for a conditional mean):  Suppose, in the spirit of the Hausman (1978) methodology, two estimators of the conditional mean parameters $\theta_o$ are compared in order to detect misspecification of the regression function. Because the QMLE's considered here yield consistent estimators of the conditional mean, it is natural to base a test on the difference of two such estimators. A regression form of the test can be derived which does not require either estimator to be the efficient QMLE. Also, only one of the QMLE's needs to be computed.

Suppose that $\hat\theta_T$ is the QMLE from an LEF indexed by $(a_1,b_1,c_1)$ and nuisance parameters $\eta_{t1}(\pi_1)$ and a second estimator is to be used from the LEF $(a_2,b_2,c_2)$ with nuisance parameters $\eta_{t2}(\pi_2)$. Then a statistic that is asymptotically equivalent to the Hausman test which directly compares the two estimators is obtained by taking

(4.1)    $\Lambda_t(\theta,\pi) = [\nabla_m c_1(m_t(\theta),\eta_{t1}(\pi_1))]^{-1}\,\nabla_m c_2(m_t(\theta),\eta_{t2}(\pi_2))\,\nabla_\theta m_t(\theta)$.

In the notation of Theorem 3.1, $\pi = (\pi_1',\pi_2')'$ and $c(m,\eta) = c_1(m,\eta_1)$.

Let $\hat\pi_{T1}$ and $\hat\pi_{T2}$ denote the nuisance parameter estimators. The procedure for carrying out the Hausman test is as follows.

(i)  Given the nuisance parameter estimate $\hat\pi_{T1}$, compute the QMLE $\hat\theta_T$ using the LEF $(a_1,b_1,c_1)$. From $\hat\theta_T$, $\hat\pi_{T1}$ and $\hat\pi_{T2}$, compute the weighted residuals, $\tilde U_t = \hat U_t \nabla\hat c_{t1}^{1/2}$, the weighted regression function $\nabla\tilde m_t = \nabla\hat c_{t1}^{1/2}\nabla\hat m_t$, and the weighted indicator function $\tilde\Lambda_t = \nabla\hat c_{t1}^{1/2}\hat\Lambda_t = \nabla\hat c_{t1}^{-1/2}\nabla\hat c_{t2}\nabla\hat m_t$ (note that the indicator is just another weighting of the regression function).

(ii)  Perform the multivariate regression of

$\tilde\Lambda_t$  on  $\nabla\tilde m_t$

and keep the residuals, say $\ddot\Lambda_t$.

(iii)  Perform the regression

1  on  $\tilde U_t \ddot\Lambda_t$,  $t=1,\ldots,T$

and use $TR^2$ from this regression as $\chi^2_P$ under $H_0$, where $P$ is the dimension of $\theta_o$.

For concreteness, consider a special case. Suppose $Y_t$ is a scalar count variable, and the researcher postulates the conditional mean function

(4.2)    $H_0$: $E(Y_t|X_t) = \exp(W_t\theta_o)$

where $W_t$ is a $1\times P$ subvector of $X_t$ with a maximum lag length that is independent of $t$ (for cross section applications, $W_t$ is a subset of $Z_t$). Because $Y_t$ is a count variable, it is sensible to use a Poisson likelihood function to estimate $\theta_o$. However, for $Y_t$ to truly have a conditional Poisson distribution, its conditional mean and conditional variance must be equal; this is a fairly strong restriction which can yield misleading results if it is violated. For now, assume that interest lies only in testing $H_0$. If $H_0$ is true, the Poisson QMLE and NLLS both consistently estimate $\theta_o$; this is the basis for a Hausman test.

For the Poisson model, there are no nuisance parameters. In the previous notation, $a_1(m) = -m$, $b_1(y) = -\log(y!)$ and $c_1(m) = \log m$. For NLLS, the nuisance parameter is the variance associated with the normal density, so $\eta_{t2}(\pi_2) = \sigma^2$, $a_2(m,\sigma^2) = -m^2/(2\sigma^2)$, $b_2(y,\sigma^2) = -y^2/(2\sigma^2) - \tfrac{1}{2}\log(2\pi\sigma^2)$, and $c_2(m,\sigma^2) = m/\sigma^2$. Then

$\nabla m_t(\theta) = \exp(W_t\theta)W_t$,  $\nabla c_{t1}(\theta) = \exp(-W_t\theta)$,  $\nabla c_{t2}(\theta,\sigma^2) = 1/\sigma^2$

and

$\nabla\hat c_{t1}^{1/2}\hat\Lambda_t = \exp(\tfrac{3}{2}W_t\hat\theta_T)W_t/\hat\sigma_T^2$,  $\nabla\hat c_{t1}^{1/2}\nabla\hat m_t = \exp(\tfrac{1}{2}W_t\hat\theta_T)W_t$

where $\hat\theta_T$ is the Poisson QMLE. The estimate of $\sigma^2$ can be ignored since it appears only as a scaling factor. Define the residuals $\hat U_t = Y_t - \exp(W_t\hat\theta_T)$ and the weighted residuals $\tilde U_t = \exp(-\tfrac{1}{2}W_t\hat\theta_T)\hat U_t$. Perform the multivariate regression

$\exp(\tfrac{3}{2}W_t\hat\theta_T)W_t$  on  $\exp(\tfrac{1}{2}W_t\hat\theta_T)W_t$

and keep the residuals, say $\ddot\Lambda_t$. Then perform the regression

1  on  $\tilde U_t\ddot\Lambda_t$

and use $TR^2$ as $\chi^2_P$ under $H_0$.
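
The Poisson special case can be sketched as follows. This is a minimal illustration, not the author's code: the helper names (`poisson_qmle`, `robust_hausman_poisson`), the Newton iteration with a least squares starting value, and the simulated data are all assumptions made for the example.

```python
import numpy as np

def poisson_qmle(y, W, iters=100):
    """Poisson QMLE of theta in E(y|W) = exp(W theta), by Newton's method."""
    # Starting value from OLS of log(y + 1) on W (an assumption of this sketch)
    theta, *_ = np.linalg.lstsq(W, np.log(y + 1.0), rcond=None)
    for _ in range(iters):
        mu = np.exp(W @ theta)
        score = W.T @ (y - mu)
        hess = (W * mu[:, None]).T @ W
        step = np.linalg.solve(hess, score)
        theta = theta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return theta

def robust_hausman_poisson(y, W):
    """T * R^2 (uncentered) from the regression of 1 on weighted residual
    times the purged weighted indicator; sigma^2 is dropped as a pure scale."""
    T = y.shape[0]
    theta = poisson_qmle(y, W)
    idx = W @ theta
    u_w = np.exp(-0.5 * idx) * (y - np.exp(idx))   # weighted residuals
    grad_w = np.exp(0.5 * idx)[:, None] * W        # weighted gradient
    lam_w = np.exp(1.5 * idx)[:, None] * W         # weighted indicator
    # Multivariate regression of the indicator on the gradient; keep residuals
    coef, *_ = np.linalg.lstsq(grad_w, lam_w, rcond=None)
    lam_resid = lam_w - grad_w @ coef
    # Regression of 1 on u_w * lam_resid
    z = u_w[:, None] * lam_resid
    ones = np.ones(T)
    b, *_ = np.linalg.lstsq(z, ones, rcond=None)
    return T - np.sum((ones - z @ b) ** 2)

# Usage on simulated count data (assumed design)
rng = np.random.default_rng(0)
T = 500
x = rng.normal(size=T)
W = np.column_stack([np.ones(T), x])
y = rng.poisson(np.exp(0.5 + 0.3 * x)).astype(float)
stat = robust_hausman_poisson(y, W)  # compare with chi-square(P), here P = 2
```

Only the Poisson QMLE is computed; the NLLS estimator enters implicitly through the weighting of the indicator.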

Again, it is emphasized that this procedure does not assume that the Poisson distribution (or the normal) is correctly specified under $H_0$. This is in contrast to the usual method used to compute the Hausman test in this context. Hausman, Ostro and Wise (1984) (HOW) apply a regression method which is very similar to what would be obtained from (3.28) in the Poisson context. The difference is that they perform the NLLS estimation so that the roles of the two QMLE's are reversed (it is straightforward to carry out the robust test when $\theta_o$ is estimated by NLLS rather than Poisson QMLE). The assumption that the Poisson distribution is correct leads to a regression test that is slightly different from (3.28); essentially, the procedure is to perform an LM test for exclusion of $\exp(\tfrac{1}{2}W_t\hat\theta_T)W_t$ in the NLLS model (also see White (1985b, pp.25-26)). If the Poisson assumption is incorrect, then this approach leads to a test with incorrect asymptotic size for testing $H_0$. In addition, the HOW procedure is not consistent for testing the Poisson distributional specification in the following sense: if the conditional mean is correct but the Poisson assumption is violated, the HOW test statistic still has a well-defined limiting distribution (which is not $\chi^2_P$), rather than tending in probability to infinity. This result extends to all Hausman tests based on two QMLE's that are derived from the LEF class of densities. My opinion is that the Hausman test in the present context should not be viewed as a general test of distributional misspecification. A test which is useful for testing distributional assumptions beyond the first moment is derived in the next section. For the Poisson case, it leads to comparing the estimated conditional mean and another estimate of the conditional variance.  ■




Example 4.3 (Robust Davidson-MacKinnon nonnested hypotheses test):  Let $Y_t$ be a random scalar, and let $X_t$ be the predetermined variables as defined in Section 2. Consider two competing models for $E(Y_t|X_t)$:

$H_0$:  $E(Y_t|X_t) = m_t(X_t,\theta_o)$, some $\theta_o \in \Theta$, $t=1,2,\ldots$

$H_1$:  $E(Y_t|X_t) = \mu_t(X_t,\delta_o)$, some $\delta_o \in \Delta$, $t=1,2,\ldots$

The DM test looks for departures from $H_0$ in the direction of $H_1$ (of course the roles of $H_0$ and $H_1$ can be reversed). Assume that model $H_0$ is estimated by NLLS. Let $\hat\theta_T$ denote the NLLS estimator of $\theta_o$. When $H_0$ is true, NLLS on model $H_1$ will yield an estimator $\hat\delta_T$ which generally converges to $\delta_T^o \in \Delta$ (note that $\delta_T^o$ does not have an interpretation as "true" parameters, but it produces the smallest mean squared error approximation to $E(Y_t|X_t)$ in the parametric class $\{\mu_t(x_t,\delta): \delta \in \Delta\}$). The estimated "nuisance parameters" in this setup are $\hat\pi_T = (\hat\sigma_T^2,\hat\delta_T)$ where $\hat\sigma_T^2$ is the estimated "variance" under $H_0$.

The DM test checks for nonzero correlation between the residuals $\hat U_t = Y_t - m_t(X_t,\hat\theta_T)$ and the difference in the estimated regression functions, $\mu_t(X_t,\hat\delta_T) - m_t(X_t,\hat\theta_T)$. In the notation of Theorem 3.2,

$\nabla c_t(m_t(\theta),\eta_t(\pi)) = 1/\sigma^2$

and

$\Lambda_t(\theta,\pi) = \mu_t(\delta) - m_t(\theta)$.

If $V(Y_t|X_t) = \sigma_o^2$ is maintained, then a regression test is available from (3.28): run the regression

$\hat U_t$  on  $\nabla\hat m_t$,  $\mu_t(\hat\delta_T) - m_t(\hat\theta_T)$

and use $TR^2$ as asymptotically $\chi^2_1$ under $H_0$. This is the LM form of the usual Davidson-MacKinnon test. If $V(Y_t|X_t)$ is not constant under $H_0$, then this approach generally leads to inference with the wrong asymptotic size. The robust approach is

(i)  Estimate model $H_0$ by NLLS; save $m_t(X_t,\hat\theta_T)$, $\nabla_\theta m_t(X_t,\hat\theta_T)$, and the residuals $\hat U_t = Y_t - m_t(X_t,\hat\theta_T)$. Estimate model $H_1$ by NLLS and save $\mu_t(X_t,\hat\delta_T)$.

(ii)  Regress $\mu_t(\hat\delta_T) - m_t(\hat\theta_T)$ on $\nabla_\theta m_t(\hat\theta_T)$, $t=1,2,\ldots,T$ and save the residuals, say $\ddot\Lambda_t$.

(iii)  Regress 1 on $\hat U_t\ddot\Lambda_t$ and use $TR^2$ as $\chi^2_1$ under $H_0$.
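
Steps (i) to (iii) can be sketched as below. The function `robust_dm` is a hypothetical helper that takes the fitted values and gradient as given; the usage lines assume, purely for illustration, two competing linear models so that NLLS reduces to OLS and the gradient of the null model is its regressor matrix.

```python
import numpy as np

def robust_dm(y, m_hat, grad_m, mu_hat):
    """Heteroskedasticity-robust Davidson-MacKinnon statistic.

    y      : (T,) dependent variable
    m_hat  : (T,) fitted values of the null model
    grad_m : (T, P) gradient of the null mean function at the estimate
    mu_hat : (T,) fitted values of the alternative model
    """
    u = y - m_hat                      # residuals under H0
    lam = mu_hat - m_hat               # difference in fitted means
    # Step (ii): purge the indicator of the gradient of the null model
    coef, *_ = np.linalg.lstsq(grad_m, lam, rcond=None)
    lam_resid = lam - grad_m @ coef
    # Step (iii): regressing 1 on the single regressor z gives
    # T * R^2 (uncentered) = (sum z)^2 / sum z^2
    z = u * lam_resid
    return z.sum() ** 2 / np.sum(z ** 2)

# Usage (assumed design): H0 uses x1, H1 uses x2, errors are heteroskedastic
rng = np.random.default_rng(1)
T = 400
x1, x2 = rng.normal(size=T), rng.normal(size=T)
X0 = np.column_stack([np.ones(T), x1])
X1 = np.column_stack([np.ones(T), x2])
y = 1.0 + 0.8 * x1 + rng.normal(size=T) * np.exp(0.3 * x1)
b0, *_ = np.linalg.lstsq(X0, y, rcond=None)
b1, *_ = np.linalg.lstsq(X1, y, rcond=None)
stat = robust_dm(y, X0 @ b0, X0, X1 @ b1)  # compare with chi-square(1)
```

Because step (iii) has a single regressor, the statistic has the closed form used in the last line of the function and no second least squares call is needed.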

This approach yields correct asymptotic inference under $H_0$ in the presence of arbitrary forms of heteroskedasticity, and requires only one additional OLS regression. The OLS regression in step (ii) is the cost of the robust procedure. This heteroskedasticity-robust version should be useful in a variety of economic contexts, particularly when the dependent variable is restricted to be nonnegative. In such cases, homoskedasticity is usually an implausible assumption. Rather than compare two separate functional forms for the dependent variable (which, to perform the test correctly, requires a distributional assumption), one can compute two NLLS estimators using the same dependent variable and use the heteroskedasticity-robust form. Regression-based versions of the DM test for multivariate models estimated by nonlinear SUR which do not assume $V(Y_t|X_t)$ is constant are also available from Theorem 3.1.

5.  TESTING  SECOND  MOMENT  ASSUMPTIONS

The tests developed in Sections 3 and 4 are explicitly designed to detect misspecification of the conditional mean without making the additional assumption that the distribution used in estimation is correctly specified. In Section 3, two regression forms of the conditional moment tests were presented (see (3.27) and (3.28)) that could be guaranteed to have a limiting $\chi^2$ distribution only if the second moment of the assumed distribution matches the true conditional second moment of $Y_t$. More precisely, the regression tests in (3.27) and (3.28) effectively take the null hypothesis to be

$H_0'$:  $E(Y_t|X_t) = m_t(X_t,\theta_o)$ and $V(Y_t|X_t) = [\nabla_m c_t(m_t(\theta_o),\eta_t(\pi_o))]^{-1}$

for some $\theta_o \in \Theta$, some $\pi_o \in \Pi$.

The nuisance parameters $\pi_o$ are no longer indexed by $T$ because they are now "true" parameters which, along with $\theta_o$, index the conditional second moment of $Y_t$. For the validity of the regression tests in (3.27) and (3.28), the correct specification of the second moment is usually needed to consistently estimate the covariance matrix which appears in the conditional moment test statistics. Because violation of the distributional assumption does not lead to inconsistent estimates of a correctly specified mean, the CM tests for the conditional mean based on (3.27) or (3.28) are inconsistent for the alternative

$H_1'$:  $E(Y_t|X_t) = m_t(X_t,\theta_o)$ for some $\theta_o \in \Theta$ but $V(Y_t|X_t) \neq [\nabla_m c_t(m_t(\theta_o),\eta_t(\pi))]^{-1}$ for all $\pi \in \Pi$.

Because the Hausman test is a CM test, it is also inconsistent for testing distributional aspects beyond the first moment. A more powerful test is needed for detecting departures from the second moment assumption.

Such a test is obtained by applying White's (1982) information matrix (IM) testing principle. Actually, the focus here is on second moments, so that the test derived below is closer to the White (1980a) test for heteroskedasticity extended to nonlinear regression models. The difference is that the tests derived below are computable from linear regressions while taking only $H_0'$ as the null hypothesis (i.e. there is no need to add auxiliary assumptions under the null in order to obtain a simple regression form of the test).

In the spirit of White (1980a), a test is based on two consistent estimators of the asymptotic covariance matrix of $\hat\theta_T$ under the hypothesis that the first two conditional moments are correctly specified. Thus, interest lies in second moment misspecification which invalidates the use of the usual standard errors calculated for the QMLE $\hat\theta_T$ (although the approach easily generalizes to more general second moment tests). Let $\hat\pi_T$ be the nuisance parameter estimator, and let all other quantities be defined as before.




Consider the difference

(5.1)    $T^{-1}\sum_{t=1}^{T}\nabla\hat m_t'\,\nabla\hat c_t\,\hat U_t'\hat U_t\,\nabla\hat c_t\,\nabla\hat m_t \;-\; T^{-1}\sum_{t=1}^{T}\nabla\hat m_t'\,\nabla\hat c_t\,\nabla\hat m_t$

where all quantities are evaluated at $(\hat\theta_T,\hat\pi_T)$ (recall that $U_t$ and $\nabla m_t$ do not depend on $\hat\pi_T$). If the first two conditional moments of $Y_t$ given $X_t$ are correctly specified, then the difference in (5.1) tends to zero in probability as $T \to \infty$ and the standard errors for $\hat\theta_T$ which are computed under the information matrix equality will be asymptotically valid. If either the conditional mean or variance is misspecified then (5.1) will typically not converge to zero (although there are directions of misspecification for which this statistic will still tend to zero). Thus, a test of correctness of the first two moments can be based on a suitably standardized version of (5.1). Taking the vec of the $t$th difference in (5.1) yields

(5.2)    $\mathrm{vec}(\nabla\hat m_t'\nabla\hat c_t\hat U_t'\hat U_t\nabla\hat c_t\nabla\hat m_t - \nabla\hat m_t'\nabla\hat c_t\nabla\hat m_t)' = \mathrm{vec}(\hat U_t'\hat U_t - \nabla\hat c_t^{-1})'\,[(\nabla\hat c_t\nabla\hat m_t)\otimes(\nabla\hat c_t\nabla\hat m_t)]$

where the relationship $\mathrm{vec}(ABC) = (C'\otimes A)\,\mathrm{vec}(B)$ for conformable matrices has been used. Under $H_0'$, $E(U_t^{o\prime}U_t^o|X_t) = [\nabla c_t(\theta_o,\pi_o)]^{-1}$.
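
The algebra behind (5.2) can be checked numerically. The short script below is only a sanity check on randomly generated matrices; the dimensions and variable names are arbitrary choices for the illustration, with `grad_c` standing in for the symmetric matrix $\nabla c_t$ and `grad_m` for the $K\times P$ gradient $\nabla m_t$.

```python
import numpy as np

rng = np.random.default_rng(0)
K, P = 2, 3
grad_m = rng.normal(size=(K, P))        # stands in for the K x P gradient of m_t
S = rng.normal(size=(K, K))
grad_c = S @ S.T + np.eye(K)            # symmetric positive definite, K x K
u = rng.normal(size=(1, K))             # 1 x K residual row vector

def vec(M):
    """Column-stacking vec operator."""
    return M.reshape(-1, order="F")

# Left-hand side of (5.2): vec of the t-th difference in (5.1)
lhs = vec(grad_m.T @ grad_c @ u.T @ u @ grad_c @ grad_m
          - grad_m.T @ grad_c @ grad_m)

# Right-hand side: vec(U'U - grad_c^{-1})' [(grad_c grad_m) kron (grad_c grad_m)]
B = u.T @ u - np.linalg.inv(grad_c)
cm = grad_c @ grad_m
rhs = vec(B) @ np.kron(cm, cm)

assert np.allclose(lhs, rhs)
```

The check relies on the symmetry of `grad_c`, which is what allows the second term of (5.1) to be written with the inverse of `grad_c` sandwiched between the two weighted gradients.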

In general, a statistic based on (5.2) will have a limiting distribution that depends on the limiting distribution of $(\hat\theta_T,\hat\pi_T)$. Because $\hat\pi_T$ may come from a variety of sources, this dependence makes general derivation of the limiting distribution of the IM statistic tedious. More importantly, the resulting test statistic is computationally burdensome. A statistic which does not depend on the limiting distribution of $(\hat\theta_T,\hat\pi_T)$ would be particularly convenient in this case.

To derive such a statistic, the approach used in Section 3 is modified. Note that $[\nabla c_t(\theta_o,\pi_o)]^{-1}$ plays the same role for $U_t^{o\prime}U_t^o$ that $m_t(\theta_o)$ plays for $Y_t$. Define

$\kappa_t(\theta,\pi) = \mathrm{vec}\,[\nabla_m c_t(m_t(\theta),\eta_t(\pi))]^{-1}$

$\Lambda_t(\theta,\pi) = [(\nabla c_t(\theta,\pi)\nabla m_t(\theta)) \otimes (\nabla c_t(\theta,\pi)\nabla m_t(\theta))]$.

In Section 3, a statistic which did not depend on the limiting distribution of the parameter estimates was obtained by first removing the influence of the gradient of the conditional mean on the indicator. Things are more complicated here because $U_t^{o\prime}U_t^o$ is no longer observed, but only estimated. Nevertheless, because the gradient of $U_t(\theta)'U_t(\theta)$ evaluated at $\theta_o$ is uncorrelated with any function of $X_t$, the same strategy works.

If $\kappa_t(\theta,\pi)$ has a zero derivative with respect to a certain parameter then the estimator of this parameter has no effect asymptotically on the usual IM statistic. In what follows, $\nabla\kappa_t(\theta,\pi)$ contains only the nonzero, nonredundant elements of the gradient of $\kappa_t$ with respect to both $\theta$ and $\pi$. Define

$\hat\Gamma_T = \Big[\sum_{t=1}^{T}\nabla\hat\kappa_t'\nabla\hat\kappa_t\Big]^{-1}\sum_{t=1}^{T}\nabla\hat\kappa_t'\hat\Lambda_t$

and let $\varphi_t(\theta,\pi,\Gamma)$ denote the $Q$ nonredundant elements of

$[\mathrm{vec}(U_t(\theta)'U_t(\theta)) - \kappa_t(\theta,\pi)]'\,[\Lambda_t(\theta,\pi) - \nabla\kappa_t(\theta,\pi)\Gamma]$.

Theorem 5.1:  Suppose that the following conditions hold:

(i)  Regularity conditions B.1 in the appendix;

(ii)  For some $\theta_o \in \Theta$, $\pi_o \in \Pi$, and $t=1,2,\ldots$,

$E(Y_t|X_t) = m_t(X_t,\theta_o)$

and

$V(Y_t|X_t) = [\nabla_m c_t(m_t(X_t,\theta_o),\eta_t(X_t,\pi_o))]^{-1}$.

Then

$\xi_T \xrightarrow{d} \chi^2_Q$

where $\xi_T$ is $TR^2$ from the regression

1  on  $\hat\varphi_t$,  $t=1,\ldots,T$

and $\hat\varphi_t = \varphi_t(\hat\theta_T,\hat\pi_T,\hat\Gamma_T)$.  ■

The method for applying Theorem 5.1 is as follows:

(i)  Given $\hat\pi_T$, compute the QMLE $\hat\theta_T$, $\hat U_t$, $\nabla\hat\kappa_t$, and $\hat\Lambda_t$ as defined above;

(ii)  Perform the multivariate regression of $\hat\Lambda_t$ on $\nabla\hat\kappa_t$ and save the residuals, say $\ddot\Lambda_t$;

(iii)  Perform the OLS regression

1  on  $\hat\varphi_t$,  $t=1,\ldots,T$

and use $TR^2$ as $\chi^2_Q$, where $\hat\varphi_t$ is $1\times Q$ and contains the nonredundant elements of $\mathrm{vec}(\hat U_t'\hat U_t - \nabla\hat c_t^{-1})'\,\ddot\Lambda_t$.  ■

Theorem 5.1 gives a simple method for testing correct specification of the conditional second moment. Only least squares regressions are needed to compute the statistic. Calculation of $\nabla\kappa_t$ is typically straightforward. The parameters $\pi_o$ need not be estimated by an efficient procedure, so that Theorem 5.1 is applicable to the negative binomial generalization of the Poisson regression model as developed by GMT (1984b). As is the case with the usual IM test, one may exclude indicators and appropriately reduce the degrees of freedom in the $\chi^2$ distribution.

Example 5.1:  It is interesting to apply Theorem 5.1 to a univariate nonlinear model that has been estimated by NLLS. In this case, $\kappa_t(\theta,\pi) = \pi = \sigma^2$ so that the only nonzero element of $\nabla\kappa_t$ is 1. The indicator $\hat\Lambda_t$ is $[\nabla m_t(X_t,\hat\theta_T)'\nabla m_t(X_t,\hat\theta_T)]/\hat\sigma_T^4$ where $\hat\sigma_T^2$ is the estimator $T^{-1}\sum_{t=1}^{T}\hat U_t^2$. Theorem 5.1 leads to the regression

(5.1)    1  on  $(\hat U_t^2 - \hat\sigma_T^2)(\hat\zeta_t - \bar\zeta_T)$

where $\hat\zeta_t$ is a vector of nonconstant, nonredundant elements of $\nabla\hat m_t'\nabla\hat m_t$ and $\bar\zeta_T = T^{-1}\sum_{t=1}^{T}\hat\zeta_t$. This procedure is asymptotically equivalent to the regression form of the White test for heteroskedasticity for nonlinear regression models under the additional assumption that $E[(U_t^o)^4|X_t]$ is constant (see Domowitz and White (1982)). Interestingly, the slight modification in (5.1) (which is essentially the demeaning of the indicators $\hat\zeta_t$) yields an asymptotically $\chi^2_Q$ distributed statistic without the additional assumption of constant fourth moment for $U_t^o$. In the case of a linear time series model, the demeaning of the indicators yields a statistic which is asymptotically equivalent to Hsieh's (1983) robust form of the White test, but the above statistic is significantly easier to compute. Rarely would we care to assume anything about the fourth moment of $Y_t$, so that the robust regression form in (5.1) seems to be a useful modification.

Example 5.2:  Consider the Poisson regression model where, for a scalar count variable $Y_t$, it is assumed that $Y_t$ conditional on $X_t$ has a Poisson distribution with mean $E(Y_t|X_t) = \exp(W_t\theta_o)$. As noted in Section 3, the Hausman test is inconsistent against alternatives for which the mean is correctly specified but the distribution is otherwise misspecified. A reasonable approach is to compare the conditional mean and another estimate of the conditional variance. In this case there are no nuisance parameters and $\nabla c_t(m_t(\theta)) = \exp(-W_t\theta)$. Therefore, $\nabla\kappa_t(\theta) = \exp(W_t\theta)W_t$ and $\nabla c_t(\theta)\nabla m_t(\theta) = W_t$. Let $\zeta_t$ be a row vector of nonredundant elements of $W_t'W_t$, let $\hat\theta_T$ be the Poisson QMLE and let $\hat U_t = Y_t - \exp(W_t\hat\theta_T)$. The testing procedure is

(i)  Perform the multivariate regression of $\zeta_t$ on $\exp(W_t\hat\theta_T)W_t$, $t=1,\ldots,T$ and keep the residuals, say $\ddot\zeta_t$.

(ii)  Run the OLS regression

1  on  $(\hat U_t^2 - \exp(W_t\hat\theta_T))\,\ddot\zeta_t$

and use $TR^2$ as asymptotically $\chi^2_Q$ under $H_0$, where $Q$ is the dimension of $\zeta_t$.  ■
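
A minimal sketch of the two regressions in Example 5.2 follows. The helper names, the Newton iteration used for the Poisson QMLE, the construction of $\zeta_t$ from the upper triangle of $W_t'W_t$, and the simulated data are all assumptions of this illustration rather than part of the paper.

```python
import numpy as np

def poisson_qmle(y, W, iters=100):
    """Poisson QMLE of theta in E(y|W) = exp(W theta), by Newton's method."""
    theta, *_ = np.linalg.lstsq(W, np.log(y + 1.0), rcond=None)  # start value
    for _ in range(iters):
        mu = np.exp(W @ theta)
        step = np.linalg.solve((W * mu[:, None]).T @ W, W.T @ (y - mu))
        theta = theta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return theta

def poisson_variance_test(y, W):
    """Returns (T * R^2, Q) for the mean-equals-variance test."""
    T, P = W.shape
    mu = np.exp(W @ poisson_qmle(y, W))
    # zeta_t: elements of W_t'W_t on and above the diagonal (the caller must
    # ensure these are not linearly dependent across columns)
    iu = np.triu_indices(P)
    zeta = W[:, iu[0]] * W[:, iu[1]]
    # Step (i): regress zeta_t on exp(W_t theta) W_t and keep the residuals
    grad_kappa = mu[:, None] * W
    coef, *_ = np.linalg.lstsq(grad_kappa, zeta, rcond=None)
    zeta_resid = zeta - grad_kappa @ coef
    # Step (ii): regress 1 on (u_t^2 - fitted mean) * zeta_resid
    z = ((y - mu) ** 2 - mu)[:, None] * zeta_resid
    ones = np.ones(T)
    b, *_ = np.linalg.lstsq(z, ones, rcond=None)
    return T - np.sum((ones - z @ b) ** 2), zeta.shape[1]

# Usage on simulated Poisson data (assumed design)
rng = np.random.default_rng(2)
T = 500
x = rng.normal(size=T)
W = np.column_stack([np.ones(T), x])
y = rng.poisson(np.exp(0.2 + 0.4 * x)).astype(float)
stat, Q = poisson_variance_test(y, W)  # compare with chi-square(Q)
```

With an intercept and one regressor, $\zeta_t$ has the three elements $1$, $x_t$ and $x_t^2$, so the reference distribution in this usage example has three degrees of freedom.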

In Examples 5.1 and 5.2, the test statistics essentially check whether the information equality holds for the vector of parameters $\theta_o$, and are not consistent against alternatives for which this equality holds but the distribution is otherwise misspecified. If in Example 5.1 the regression function is correctly specified and the conditional variance of $Y_t$ given $X_t$ is constant, then the procedure outlined above cannot be expected to detect other departures from conditional normality, such as skewness or kurtosis. In Example 5.2, the equality of the conditional mean and conditional variance is being checked. Other departures from the Poisson distribution might not be detected (nor would they be very interesting).
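
The regression of Example 5.1 can be sketched for the simplest case. The function name `robust_white_test` is hypothetical, and the usage lines assume a linear model estimated by OLS (so the gradient is the regressor matrix) with simulated data; none of this comes from the paper itself.

```python
import numpy as np

def robust_white_test(u, grad_m):
    """Regression of 1 on (u_t^2 - sigma^2)(zeta_t - zeta_bar).

    u      : (T,) NLLS residuals
    grad_m : (T, P) gradient of the mean function at the estimate
    Returns (T * R^2 uncentered, Q) with Q the number of indicators used.
    """
    T, P = grad_m.shape
    # Elements of grad_m' grad_m on and above the diagonal
    iu = np.triu_indices(P)
    zeta = grad_m[:, iu[0]] * grad_m[:, iu[1]]
    # Drop constant elements (e.g. the squared intercept)
    zeta = zeta[:, zeta.std(axis=0) > 1e-12]
    sigma2 = np.mean(u ** 2)
    # Demeaned indicators times demeaned squared residuals
    z = (u ** 2 - sigma2)[:, None] * (zeta - zeta.mean(axis=0))
    ones = np.ones(T)
    b, *_ = np.linalg.lstsq(z, ones, rcond=None)
    return T - np.sum((ones - z @ b) ** 2), z.shape[1]

# Usage (assumed design): linear model with an intercept and one regressor
rng = np.random.default_rng(3)
T = 400
x = rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
y = X @ np.array([1.0, -0.5]) + rng.standard_t(df=5, size=T)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
stat, Q = robust_white_test(y - X @ beta, X)  # compare with chi-square(Q)
```

The only difference from the familiar auxiliary regression for the White test is the demeaning of the indicators, which is what removes the need for a constant conditional fourth moment.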

Before ending this section, it is useful to note that Theorem 5.1 is applicable to much more general specification tests for conditional variances, such as the Lagrange Multiplier tests in Breusch and Pagan (1979) and the ARCH tests of Engle (1982). One merely replaces the indicator

$[(\nabla c_t(\theta,\pi)\nabla m_t(\theta)) \otimes (\nabla c_t(\theta,\pi)\nabla m_t(\theta))]$

by whatever is desired, as long as the indicator depends only on the predetermined variables $X_t$ and parameters $(\theta,\pi)$. The result is tests for heteroskedasticity which are robust to departures from the distributional assumption.

Example 5.3:  Let $Y_t$ be a scalar, and suppose the null hypothesis is

$H_0$:  $E(Y_t|X_t) = m_t(X_t;\theta_o)$, some $\theta_o \in \Theta$

$V(Y_t|X_t) = \sigma_o^2$, some $\sigma_o^2 > 0$,  $t=1,2,\ldots$

Let $\hat\theta_T$ be the NLLS estimator of $\theta_o$, and let $\hat\sigma_T^2$ be the usual estimator of $\sigma_o^2$ based on the sum of squared residuals. The LM test for $Q$th order ARCH is based upon

$\sum_{t=Q+1}^{T}(\hat U_t^2 - \hat\sigma_T^2)\hat U_{t-j}^2$,  $j=1,\ldots,Q$.

The usual LM statistic is $(T-Q)R^2$ from the regression

(5.2)    $\hat U_t^2$  on  $1, \hat U_{t-1}^2, \ldots, \hat U_{t-Q}^2$,  $t=Q+1,\ldots,T$.

Because $\kappa_t(\theta,\pi) = \pi = \sigma^2$, the statistic that is derived from Theorem 5.1 is $(T-Q)R^2$ from the regression

(5.3)    1  on  $(\hat U_t^2-\hat\sigma_T^2)(\hat U_{t-1}^2-\hat\sigma_T^2), \ldots, (\hat U_t^2-\hat\sigma_T^2)(\hat U_{t-Q}^2-\hat\sigma_T^2)$,  $t=Q+1,\ldots,T$.

The regression-based form in (5.3) is robust to departures from the conditional normality assumption, and from any other auxiliary assumptions, such as constant conditional fourth moment for $U_t$. This is to be contrasted with the test derived from (5.2).  ■
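
Regression (5.3) is simple enough to sketch directly. The function name `robust_arch_test` and the simulated residuals are assumptions of this example; in practice the residuals would come from the NLLS fit of the conditional mean.

```python
import numpy as np

def robust_arch_test(u, Q=1):
    """(T - Q) * R^2 (uncentered) from regression (5.3).

    u : (T,) residuals from the estimated conditional mean
    Q : ARCH order under the alternative
    """
    T = u.shape[0]
    e = u ** 2 - np.mean(u ** 2)        # demeaned squared residuals
    # Column j holds e_t * e_{t-j} for t = Q+1, ..., T
    z = np.column_stack([e[Q:] * e[Q - j:T - j] for j in range(1, Q + 1)])
    ones = np.ones(T - Q)
    b, *_ = np.linalg.lstsq(z, ones, rcond=None)
    return (T - Q) - np.sum((ones - z @ b) ** 2)

# Usage on simulated heavy-tailed but conditionally homoskedastic residuals
rng = np.random.default_rng(4)
T = 600
u = rng.standard_t(df=6, size=T)
stat = robust_arch_test(u, Q=3)  # compare with chi-square(3)
```

Unlike regression (5.2), both the dependent-side factor and the lagged squared residuals are centered at the variance estimate, and the regression has no intercept.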


6.  CONCLUSIONS 

This paper has developed a general class of specification tests for dynamic multivariate models which impose under $H_0$ only the hypotheses being tested (correctness of the conditional mean or correctness of the conditional mean and conditional variance). The computationally simple methods proposed here should remove some of the barriers to using robust test statistics in practice.

The general approach used here has several other applications. In particular, the QMLE $\hat\theta_T$ can be replaced by any $\sqrt{T}$-consistent estimator. This is useful in situations where the conditional mean parameters are estimated using a method different from QMLE. An example is a log-linear regression model: let $\theta = (\beta',\sigma^2)'$, and

(6.1)    $\log Y_t \,|\, X_t \sim N(X_t\beta_o,\sigma_o^2)$,

so that

(6.2)    $E(Y_t|X_t) = \exp(\sigma_o^2/2 + X_t\beta_o)$.

It is easy to estimate $\theta_o$ by MLE in this case since (6.1) suggests OLS of $\log Y_t$ on $X_t$. Because we are ultimately interested in $E(Y_t|X_t)$, QMLE in this example corresponds to NLLS of $Y_t$ on $\exp(X_t\beta)$ (provided that $X_t$ contains a one). When comparing the log-linear specification to a linear-linear model $E(Y_t|X_t) = X_t\delta_o$, it is useful to use expression (6.2). The robust Davidson-MacKinnon test derived in Section 4 is immediately applicable to the functions $\exp(\hat\sigma_T^2/2 + X_t\hat\beta_T)$ and $X_t\hat\delta_T$ (no matter which model is taken to be the null), where all estimates are obtainable from OLS regressions.

The approach used in this paper also seems to generalize to models that jointly parameterize the conditional mean and variance and are estimated by QMLE using a conditional normality assumption. The multivariate ARCH-in-mean models of the type used by Bollerslev, Engle and Wooldridge (1988) fall into this class. Having robust Lagrange Multiplier tests for these models would allow specification testing of the conditional mean and variance without taking the normality assumption seriously. This research is currently in progress.


MATHEMATICAL  APPENDIX 

For convenience, I include a lemma which is used repeatedly in the proofs of Theorems 3.1 and 5.1.

Lemma A.1:  Assume that the sequence of random functions $\{Q_T(W_T,\theta): \theta \in \Theta,\ T=1,2,\ldots\}$, where $Q_T(W_T,\cdot)$ is continuous on $\Theta$ and $\Theta$ is a compact subset of $\mathbb{R}^P$, and the sequence of nonrandom functions $\{\bar Q_T(\theta): \theta \in \Theta,\ T=1,2,\ldots\}$, satisfy the following conditions:

(i)  $\sup_{\theta\in\Theta}|Q_T(W_T,\theta) - \bar Q_T(\theta)| \xrightarrow{p} 0$;

(ii)  $\{\bar Q_T(\theta): \theta \in \Theta,\ T=1,2,\ldots\}$ is continuous on $\Theta$ uniformly in $T$.

Let $\hat\theta_T$ be a sequence of random vectors such that $\hat\theta_T - \theta_T^o \xrightarrow{p} 0$ where $\{\theta_T^o\} \subset \Theta$. Then

$Q_T(W_T,\hat\theta_T) - \bar Q_T(\theta_T^o) \xrightarrow{p} 0$.

Proof:  see Wooldridge (1986, Lemma A.1, p.229).  ■

A definition simplifies the statement of the conditions.

Definition A.1:  A sequence of random functions $\{q_t(Y_t,X_t,\theta): \theta \in \Theta,\ t=1,2,\ldots\}$, where $q_t(Y_t,X_t,\cdot)$ is continuous on $\Theta$ and $\Theta$ is a compact subset of $\mathbb{R}^P$, is said to satisfy the Uniform Weak Law of Large Numbers (UWLLN) and Uniform Continuity (UC) conditions provided that

(i)  $\sup_{\theta\in\Theta}\Big|T^{-1}\sum_{t=1}^{T}\{q_t(Y_t,X_t,\theta) - E[q_t(Y_t,X_t,\theta)]\}\Big| \xrightarrow{p} 0$

and

(ii)  $\{T^{-1}\sum_{t=1}^{T}E[q_t(Y_t,X_t,\theta)]: \theta \in \Theta,\ T=1,2,\ldots\}$ is continuous on $\Theta$ uniformly in $T$.  ■

In the statement of the conditions, the dependence of functions on the predetermined variables $X_t$ is frequently suppressed for notational convenience. Also, $c_t(\theta,\pi)$ is used as a shorthand for $c(m_t(\theta),\eta_t(\pi))$. Similarly for $a_t(\theta,\pi)$. If a gradient operator is not subscripted, the derivative is with respect to all parametric arguments of the function (either $\theta$ or $(\theta,\pi)$). If $a(\theta)$ is a $1\times L$ vector function of the $P\times 1$ vector $\theta$ then, by convention, $\nabla_\theta a(\theta)$ is the $L\times P$ matrix $\nabla_\theta[a(\theta)']$. If $A(\theta)$ is a $Q\times L$ matrix function of the $P\times 1$ vector $\theta$, the matrix $\nabla_\theta A(\theta)$ is the $QL\times P$ matrix defined as

$\nabla_\theta A(\theta)' = [\nabla_\theta A_1(\theta)' \,|\, \cdots \,|\, \nabla_\theta A_Q(\theta)']$

where $A_j(\theta)$ is the $j$th row of $A(\theta)$ and $\nabla_\theta A_j(\theta)$ is the $L\times P$ gradient of $A_j(\theta)$ as defined above. For simplicity, define

$\nabla_m^2 c(m,\eta) \equiv \nabla_m[\nabla_m c(m,\eta)']$

$\nabla_{m\eta}^2 c(m,\eta) \equiv \nabla_\eta[\nabla_m c(m,\eta)']$

$\nabla_\theta^2 m_t(\theta) \equiv \nabla_\theta[\nabla_\theta m_t(\theta)']$.

Note that $\nabla_m^2 c(m,\eta)$ is $K^2\times K$, $\nabla_{m\eta}^2 c(m,\eta)$ is $K^2\times J$, and $\nabla_\theta^2 m_t(\theta)$ is $KP\times P$.


Conditions A.1:

(i)  $\Theta$ and $\Pi$ are compact with nonempty interior;

(ii)  $\{m_t(x_t,\theta): \theta \in \Theta\}$, with $x_t$ ranging over the Euclidean space in which $X_t$ takes values, is a sequence of real-valued functions such that $m_t(\cdot,\theta)$ is Borel measurable for each $\theta \in \Theta$ and $m_t(x_t,\cdot)$ is twice continuously differentiable on the interior of $\Theta$ for all $x_t$, $t=1,2,\ldots$;

(iii)  The functions $a: \mathcal{M}\times\mathcal{N} \to \mathbb{R}$ and $c: \mathcal{M}\times\mathcal{N} \to \mathbb{R}^K$ are such that

(a)  $a$ is continuous on $\mathcal{M}\times\mathcal{N}$ and for each $\eta \in \mathcal{N}$, $a(\cdot\,;\eta)$ is continuously differentiable on the interior of $\mathcal{M}$;

(b)  $c$ is twice continuously differentiable on the interior of $\mathcal{M}\times\mathcal{N}$;

(c)  For all $(m,\eta) \in$ interior $\mathcal{M}\times\mathcal{N}$, $m\nabla_m c(m,\eta) = -\nabla_m a(m,\eta)$;

(iv)  $T^{1/2}(\hat\pi_T - \pi_T^o) = O_p(1)$;

(v)  (a)  $\{a_t(\theta,\pi) + m_t(\theta_o)c_t(\theta,\pi)\}$ and $\{U_t^o c_t(\theta,\pi)\}$ satisfy the UWLLN and UC conditions;

(c)  $\theta_o$ is the identifiably unique maximizer (see Bates and White (1985)) of

$T^{-1}\sum_{t=1}^{T}E[a_t(\theta,\pi_T^o) + m_t(\theta_o)c_t(\theta,\pi_T^o)]$;

(vi)  $\theta_o$ is in the interior of $\Theta$, and $\{\pi_T^o\}$ is in the interior of $\Pi$ uniformly in $T$;

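(vii)  (a)  $\{\nabla_\theta m_t(\theta)'\nabla_m c_t(\theta,\pi)\nabla_\theta m_t(\theta)\}$,

$\{\nabla_\theta m_t(\theta)'\nabla_m^2 c_t(\theta,\pi)'[I_K \otimes U_t(\theta)']\nabla_\theta m_t(\theta)\}$,

$\{\nabla_\theta^2 m_t(\theta)'[I_P \otimes \nabla_m c_t(\theta,\pi)U_t(\theta)']\}$, and

$\{\nabla_\pi\eta_t(\pi)'\nabla_{m\eta}^2 c_t(\theta,\pi)'[I_K \otimes U_t(\theta)']\nabla_\theta m_t(\theta)\}$

satisfy the UWLLN and UC conditions.

(b)  $A_T^o \equiv T^{-1}\sum_{t=1}^{T}E[\nabla_\theta m_t(\theta_o)'\nabla_m c_t(\theta_o,\pi_T^o)\nabla_\theta m_t(\theta_o)]$ is $O(1)$ and uniformly positive definite;

(c)  $T^{-1/2}\sum_{t=1}^{T}\nabla_\theta m_t(\theta_o)'\nabla_m c_t(\theta_o,\pi_T^o)U_t^{o\prime} = O_p(1)$;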

(viii)        (a)     {V^m.  (G)' V^c.  (G,TT)'  [I,,    ®   U^(G)'  ]V_A,  (G,n)}, 
ot  mt  K  t  ft 

[V    -n,  (TT)' V^    c.  (G,n)[I^,    ®    U.  (9)'  :V    A,  (G,TT)}, 
n    t  mn    r       ■  K  t  G    t 

■[■7    A    (©,n)'  [I^,    ®    V    c.  (G,n)U,  (G)'  ]},     and 
©    t       ■  K  m    t       ■  t 

[V    A.  (©,TT)'  [T  ..    ®    V    c.  (9,TT)U,  (G)'  ]} 
n    t       "  K  m    t       "  t



satisfy  the  WULLN  and  UC  conditions. 
(ix)  (a)

is  0(1)  and  uniformly  positive  definite; 

■CA^(e,n)'V  c^(e,n)U^(e)'U^(e)V  c^(e,n)V^m^(e)},  and 
t        mt"    t      t     mt       et 

{V^m.  (e)'V  c.  (e.n)U.  (G)'U^(G)V  c  .  (  G,  tt  )  V^m  .  (  9)  } 
9t      mt       t      t     mt"    St 

satisfy  the  UWLLN  and  UC  conditions.   ■ 


Proof of Theorem 3.1:  The major task of the proof is establishing
the validity of equation (3.24).  For notational simplicity, we
explicitly consider the case K = 1; the case K > 1 is similar but
notationally cumbersome.

By a weak consistency analog of Bates and White (1985, Theorem
2.2), assumptions (i), (ii), (iii), (iv) and (v) imply that
$\hat\theta_T \overset{p}{\to} \theta_o$.


Consistency of $\hat\theta_T$ and (vi) imply that

(a.1)  $P\left[\sum_{t=1}^{T}\nabla_\theta\hat m_t'\nabla_m\hat c_t\hat U_t = 0\right] \to 1$ as $T \to \infty$.


Expanding the score $S_T(\hat\theta_T,\hat\pi_T)$ in a mean value expansion
about $\theta_o$ (Jennrich (1969, Lemma 3)) yields

(a.2)  $S_T(\hat\theta_T,\hat\pi_T) = S_T(\theta_o,\hat\pi_T) + H_T(\hat\theta_T - \theta_o)$

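where $H_T$ is the hessian with respect to $\theta$ with rows evaluated at
mean values on the line segment connecting $\hat\theta_T$ and $\theta_o$.  For
any $\theta \in \text{int } \Theta$,

$$h_t(\theta,\pi) = -\nabla_\theta m_t(\theta)'\nabla_m c_t(\theta,\pi)\nabla_\theta m_t(\theta)
  + U_t(\theta)\{\nabla_\theta m_t(\theta)'\nabla_m^2 c_t(\theta,\pi)\nabla_\theta m_t(\theta) + \nabla_m c_t(\theta,\pi)\nabla_\theta^2 m_t(\theta)\}.$$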

Because $\hat\theta_T \overset{p}{\to} \theta_o$, $\hat\pi_T - \pi_T^o \overset{p}{\to} 0$, and all
components of $h_t(\theta,\pi)$ satisfy the UWLLN and UC conditions by
(vii.a), it follows that

$$H_T/T + A_T^o \overset{p}{\to} 0.$$

By (vii.b), $A_T^o$ is $O(1)$ and uniformly p.d.  Therefore, $H_T$ is
nonsingular with probability approaching one, so by Lemma A.1,
$(H_T/T)^{-1} + A_T^{o-1} \overset{p}{\to} 0$.  Combined with (a.1) and (a.2)
this implies that

(a.3)  $T^{1/2}(\hat\theta_T - \theta_o) = A_T^{o-1}T^{-1/2}S_T(\theta_o,\hat\pi_T) + o_p(1)$.
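
The passage from (a.1) and (a.2) to (a.3) can be written out step by
step.  The following is only a sketch for the case K = 1; it assumes
that $T^{-1/2}S_T(\theta_o,\hat\pi_T) = O_p(1)$, which is what (a.4) together
with (vii.c) delivers.

```latex
\begin{align*}
0 &= T^{-1/2}S_T(\hat\theta_T,\hat\pi_T)
     && \text{with probability approaching one, by (a.1)} \\
  &= T^{-1/2}S_T(\theta_o,\hat\pi_T) + (H_T/T)\,T^{1/2}(\hat\theta_T-\theta_o)
     && \text{by (a.2)} \\
\Rightarrow\quad
T^{1/2}(\hat\theta_T-\theta_o)
  &= -(H_T/T)^{-1}\,T^{-1/2}S_T(\theta_o,\hat\pi_T) \\
  &= \bigl[A_T^{o-1} + o_p(1)\bigr]\,T^{-1/2}S_T(\theta_o,\hat\pi_T)
     && \text{since } (H_T/T)^{-1} + A_T^{o-1} \overset{p}{\to} 0 \\
  &= A_T^{o-1}\,T^{-1/2}S_T(\theta_o,\hat\pi_T) + o_p(1)\,O_p(1).
\end{align*}
```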

Next, note that by a mean value expansion about $\pi_T^o$,

$$T^{-1/2}S_T(\theta_o,\hat\pi_T) = T^{-1/2}S_T(\theta_o,\pi_T^o) + [\nabla_\pi\ddot S_T/T]\,T^{1/2}(\hat\pi_T - \pi_T^o) + o_p(1)$$

where $\nabla_\pi\ddot S_T$ has rows evaluated at $(\theta_o,\ddot\pi_T)$ and $\ddot\pi_T$ are
mean values between $\hat\pi_T$ and $\pi_T^o$.  By (vii.a), $\nabla_\pi s_t(\theta_o,\pi)$
satisfies the UWLLN and UC conditions so that by (iv) and Lemma A.1,
$[\nabla_\pi\ddot S_T/T] - E[\nabla_\pi S_T(\theta_o,\pi_T^o)/T] = o_p(1)$.  But, under $H_0$,
$E[\nabla_\pi s_t(\theta_o,\pi_T^o)|X_t] = 0$, so that $\nabla_\pi\ddot S_T/T = o_p(1)$.
Combined with (iv) this shows that

(a.4)  $T^{-1/2}S_T(\theta_o,\hat\pi_T) = T^{-1/2}S_T(\theta_o,\pi_T^o) + o_p(1)$.

Substituting (a.4) into (a.3) gives the standard asymptotic
equivalence for the QMLE

$$T^{1/2}(\hat\theta_T - \theta_o) = A_T^{o-1}T^{-1/2}S_T^o + o_p(1);$$

in particular, $T^{1/2}(\hat\theta_T - \theta_o) = O_p(1)$ by (vii.c).

Next, consider equation (3.24).  First, (vii.b) guarantees that
$B_T^o$ exists for sufficiently large $T$ and is $O(1)$.  Rewriting (3.24)
gives

(a.6)  $\hat\xi_T = T^{-1/2}\sum_{t=1}^{T}(\nabla_m\hat c_t^{1/2}\hat\Lambda_t - \nabla_m\hat c_t^{1/2}\nabla_\theta\hat m_tB_T^o)'\nabla_m\hat c_t^{1/2}\hat U_t'
        - T^{1/2}(\hat B_T - B_T^o)'\,T^{-1}\sum_{t=1}^{T}\nabla_\theta\hat m_t'\nabla_m\hat c_t\hat U_t'$.

By (a.1), the second term on the right hand side of (a.6) is zero
with probability approaching one.  Taking a mean value expansion
about $(\theta_o,\pi_T^o)$ of the first term on the right hand side of (a.6)
and applying Lemma A.1, (vii), and (viii) yields

(a.7)  $\hat\xi_T = T^{-1/2}\sum_{t=1}^{T}(\Lambda_t^o - \nabla_\theta m_t^oB_T^o)'\nabla_m c_t^oU_t^o$

-     [t-I    E   EV   A°'V   cX^V-Cn^   -    n    ) 
L         +.=  1        rrtmttj  T  o 

+       T  ^    [A.  '7^    C.7    Ta.U.]     T  (TT_    -    TT     ) 

L        t-1       tmntTTttJ  T  o 




+  o  (1) 

p 


Under $H_0$, $E[U_t^o|X_t] = 0$, so that the first term in each of lines
two, three, five, and six of (a.7) has zero expectation by the law of
iterated expectations.  Because each of these terms satisfies the
WLLN, $T^{1/2}(\hat\theta_T - \theta_o) = O_p(1)$, and $T^{1/2}(\hat\pi_T - \pi_T^o) = O_p(1)$,
the expressions in lines two, three, five, and six are all $o_p(1)$.
By definition of $B_T^o$, the term in line four is also $o_p(1)$.  This
establishes that

$$\hat\xi_T = T^{-1/2}\sum_{t=1}^{T}(\Lambda_t^o - \nabla_\theta m_t^oB_T^o)'\nabla_m c_t^oU_t^o + o_p(1)$$

so that (3.24) in the text holds.  By (ix.a), $\Xi_T^o$ is $O(1)$ and
uniformly p.d.  By (ix.b), $\Xi_T^{o-1/2}\hat\xi_T \overset{d}{\to} N(0,I_Q)$.  Condition
(ix.c) ensures that $\hat\Xi_T$ is a consistent estimator of $\Xi_T^o$, and
consequently is positive definite with probability approaching one.
Therefore, $\hat\xi_T'\hat\Xi_T^{-1}\hat\xi_T \overset{d}{\to} \chi_Q^2$, and this completes the
proof.   ■
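
The quadratic form at the end of the proof can be computed with two
least squares regressions.  The following Python sketch illustrates
this for K = 1 under stated assumptions: `u` holds the residuals
$\hat U_t$, `grad_m` the T x P gradient $\nabla_\theta\hat m_t$, `lam` the T x Q
misspecification indicators $\hat\Lambda_t$, and `w` the scalar weights
$\nabla_m\hat c_t$.  The function and argument names are hypothetical and
are not taken from the paper.

```python
import numpy as np

def robust_cm_statistic(u, grad_m, lam, w):
    """Sketch of the robust regression-based statistic for K = 1.

    Returns the quadratic form xi' Xi^{-1} xi and the equivalent
    quantity T - SSR from regressing 1 on the weighted products.
    All argument names are illustrative assumptions.
    """
    u = np.asarray(u, dtype=float)
    sw = np.sqrt(np.asarray(w, dtype=float))        # square-root weights
    g = sw[:, None] * np.asarray(grad_m, dtype=float)  # weighted gradient, T x P
    l = sw[:, None] * np.asarray(lam, dtype=float)     # weighted indicators, T x Q

    # Regression 1: weighted indicators on weighted gradient; keep residuals.
    b_hat, *_ = np.linalg.lstsq(g, l, rcond=None)
    r = l - g @ b_hat

    # Products of weighted residuals u_t and regression-1 residuals.
    z = (sw * u)[:, None] * r                       # T x Q
    T = u.shape[0]

    # Direct quadratic form: xi = T^{-1/2} sum z_t, Xi = T^{-1} sum z_t' z_t.
    xi = z.sum(axis=0) / np.sqrt(T)
    Xi = z.T @ z / T
    quad = float(xi @ np.linalg.solve(Xi, xi))

    # Regression 2: regress a vector of ones on z; T - SSR equals quad.
    ones = np.ones(T)
    coef, *_ = np.linalg.lstsq(z, ones, rcond=None)
    ssr = float(np.sum((ones - z @ coef) ** 2))
    return quad, T - ssr
```

Under the null, either returned value would be compared with a
$\chi_Q^2$ critical value.  The second regression uses no intercept, which
is why T minus its sum of squared residuals reproduces the quadratic
form.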

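In what follows, let

$\nu_t(\theta) \equiv [\text{vec } U_t(\theta)'U_t(\theta)]'$

$\kappa_t(\theta,\pi) \equiv [\text{vec } \nabla_m c_t(\theta,\pi)^{-1}]'$

$\Lambda_t(\theta,\pi) \equiv [(\nabla_m c_t(\theta,\pi)\nabla_\theta m_t(\theta)) \otimes (\nabla_m c_t(\theta,\pi)\nabla_\theta m_t(\theta))]$.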

Conditions B.1:  Conditions (i)-(vi) in A.1 hold.  In addition,
(vii')  The following functions satisfy the WULLN and UC
conditions:

•[Vx^^(9)A^(G,TT)},     •[Vi:^(G,n)A^(9,TT)},     [Vv^  (  G)  Vif  ^(  6,  n)  '  }  , 
■[VK^(G,n)"7i;^(G,n)},     CVi:^  (  G,  n)  '  A^  (  9,  n  )  }  , 

{7A^(G,n)'  CIp^^    ®   v^(G)']},     [VA^  (  G,  n)  '  [  Ip^^^    ®    if  ^  (  G)  '  ]  }  , 
;:7'^i;^(e,n)'  [Ip_^^    e>    v^(G)']},     ;:V'^i:^(G,n)'  CIp^^^    <&    i;^(G)'3}; 

(viii')  $T^{-1/2}\sum_{t=1}^{T}[\nu_t(\theta_o) - \kappa_t(\theta_o,\pi_T^o)]'\nabla\kappa_t(\theta_o,\pi_T^o) = O_p(1)$;

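(ix')  (a)  $\Xi_T^o \equiv T^{-1}\sum_{t=1}^{T} E[\psi_t(\theta_o,\pi_T^o,\Gamma_T^o)'\psi_t(\theta_o,\pi_T^o,\Gamma_T^o)]$ is $O(1)$ and
uniformly p.d.;

(b)  $\Xi_T^{o-1/2}T^{-1/2}\sum_{t=1}^{T}\psi_t(\theta_o,\pi_T^o,\Gamma_T^o) \overset{d}{\to} N(0,I_Q)$;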

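(c)  $\{\psi_t(\theta,\pi,\Gamma)'\psi_t(\theta,\pi,\Gamma)\}$ satisfies the WULLN and UC
conditions.

Proof of Theorem 5.1:  For notational simplicity, again consider the
case K = 1.  First, because (i)-(vi) of A.1 hold,
$T^{1/2}(\hat\theta_T - \theta_o) = O_p(1)$ and $T^{1/2}(\hat\pi_T - \pi_T^o) = O_p(1)$.  Next, note
that

(a.8)  $T^{-1/2}\sum_{t=1}^{T}[\hat U_t^2 - \hat\kappa_t][\hat\Lambda_t - \hat\Gamma_T\nabla\hat\kappa_t]
        = T^{-1/2}\sum_{t=1}^{T}[\hat U_t^2 - \hat\kappa_t][\hat\Lambda_t - \Gamma_T^o\nabla\hat\kappa_t]
        - (\hat\Gamma_T - \Gamma_T^o)\,T^{-1/2}\sum_{t=1}^{T}[\hat U_t^2 - \hat\kappa_t]\nabla\hat\kappa_t$.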

It is straightforward to show that (vii') and Lemma A.1 imply that
$(\hat\Gamma_T - \Gamma_T^o) = o_p(1)$.  Next, a mean value expansion, (vi'), (vii'),
$T^{1/2}(\hat\theta_T - \theta_o) = O_p(1)$ and $T^{1/2}(\hat\pi_T - \pi_T^o) = O_p(1)$ imply that

$$T^{-1/2}\sum_{t=1}^{T}[\nu_t(\hat\theta_T) - \kappa_t(\hat\theta_T,\hat\pi_T)]\nabla\kappa_t(\hat\theta_T,\hat\pi_T) = O_p(1).$$

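Expanding the first term on the right hand side of (a.8) about
$(\theta_o,\pi_T^o)$ yields

$$T^{-1/2}\sum_{t=1}^{T}[U_t(\hat\theta_T)^2 - \kappa_t(\hat\theta_T,\hat\pi_T)][\Lambda_t(\hat\theta_T,\hat\pi_T) - \Gamma_T^o\nabla\kappa_t(\hat\theta_T,\hat\pi_T)]$$
$$= T^{-1/2}\sum_{t=1}^{T}[(U_t^o)^2 - \kappa_t^o][\Lambda_t^o - \Gamma_T^o\nabla\kappa_t^o]$$
$$+ T^{-1}\sum_{t=1}^{T}[(U_t^o)^2 - \kappa_t^o][\nabla\Lambda_t^o - \Gamma_T^o\nabla^2\kappa_t^o]\,T^{1/2}(\hat\delta_T - \delta_o)$$
$$- T^{-1}\sum_{t=1}^{T}2\nabla m_t^{o\prime}U_t^o[\Lambda_t^o - \Gamma_T^o\nabla\kappa_t^o]'\,T^{1/2}(\hat\theta_T - \theta_o)$$
$$- T^{-1}\sum_{t=1}^{T}[\Lambda_t^o - \Gamma_T^o\nabla\kappa_t^o]'\nabla\kappa_t^o\,T^{1/2}(\hat\delta_T - \delta_o) + o_p(1)$$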

where $\delta \equiv (\theta',\pi')'$.  Under $H_0$, $E[U_t^o|X_t] = 0$ and
$E[(U_t^o)^2|X_t] = \kappa_t^o$; therefore, the second and third terms on the
RHS of (a.8) are $o_p(1)$.  $\Gamma_T^o$ is defined so that

$$T^{-1}\sum_{t=1}^{T} E\{[\Lambda_t^o - \Gamma_T^o\nabla\kappa_t^o]'\nabla\kappa_t^o\} = 0.$$

Therefore, the fourth term is $o_p(1)$.  We have shown that

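(a.9)  $T^{-1/2}\sum_{t=1}^{T}[\hat U_t^2 - \hat\kappa_t][\hat\Lambda_t - \hat\Gamma_T\nabla\hat\kappa_t]
        = T^{-1/2}\sum_{t=1}^{T}[(U_t^o)^2 - \kappa_t^o][\Lambda_t^o - \Gamma_T^o\nabla\kappa_t^o] + o_p(1)$.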

Recalling that $\psi_t^o$ denotes the nonredundant elements in the summation
on the right hand side, it follows from (ix'.a,b) that
$\Xi_T^{o-1/2}T^{-1/2}\sum_{t=1}^{T}\psi_t^o \overset{d}{\to} N(0,I_Q)$.  Combined with (ix'.c), this
shows that the test statistic is asymptotically $\chi_Q^2$ under $H_0$ and
completes the proof.   ■